Nucleic acids for detecting colon cancer related gene (CCRG) and methods of use

ABSTRACT

Nucleic acids and polypeptides correlated with cancer are disclosed. Also disclosed are methods of detecting cancer in a biological sample by determining expression of a colon carcinoma related gene (CCRG) or protein in that sample.

CROSS REFERENCE TO RELATED APPLICATION

[0001] The present application claims the benefit of U.S. Provisional Application No. 60/200,292, filed Apr. 28, 2000.

STATEMENT AS TO FEDERALLY SPONSORED RESEARCH

[0002] Not applicable.

FIELD OF THE INVENTION

[0003] The invention relates generally to the fields of molecular biology, genomics, bioinformatics, pathology, and medicine. More particularly, the invention relates to a gene whose expression is modulated in select cancers.

BACKGROUND

[0004] With the recent efforts to sequence the entire human genome, the nucleotide sequences of more than 100,000 human genes are expected to be known within the next few years. See, e.g., Robbins, R. J., J. Computat. Biol., 3: 465-478, 1996; Andrade, M. A. and Sander, C., Curr. Opin. Biotechnol., 8: 675-683, 1997; and Collins et al., Science, 282: 682-689, 1998. Once characterized, these genes are anticipated to be useful for identifying new diagnostic and therapeutic targets for a variety of different diseases. Fannon, M. R., Trends Biotechnol., 14: 294-298, 1996. Already several attempts have been made to identify genes or gene products that are uniquely expressed in diseased tissue. The results of these efforts indicated that pathology correlates more often with the pattern of gene expression in the diseased tissue rather than simply with the absence or presence of a particular gene.

SUMMARY

[0005] The invention relates to the discovery of specific polynucleotide sequences that are upregulated in select cancer cells as compared to non-diseased cells. In particular, several expressed sequence tags (ESTs) more prevalent in cancer tissue libraries than in corresponding non-cancerous tissue libraries were identified. These ESTs were then used to identify specific UniGene clusters associated with cancer. See, Schuler, J. Mol. Med. 75(10), 694-698, 1998; Schuler et al., Science 274, 540-546, 1996; and Boguski & Schuler, Nature Genetics 10, 369-371, 1995. Based on the identified polynucleotide sequences, a partial gene sequence termed C4, whose expression is selectively upregulated in colon tumors, was identified. Using this partial sequence, a full length gene, termed CCRG (Colon Carcinoma Related Gene), containing the C4 sequence was isolated and sequenced.

[0006] An open reading frame of the CCRG gene encodes a polypeptide, i.e., the CCRG protein, which was predicted to have a signal peptide sequence, and putative phosphorylation, myristylation, and glycosylation sites. Based on comparisons to sequences of known function, the nucleotide sequence of CCRG (and C4) was predicted to encode a prokaryotic lipoprotein binding site and a prenylation site. The C-terminus of the CCRG protein is cysteine rich and contains a motif found in ultra high sulphur matrix protein, hair keratin, metallothionein and cation transporters. Using the secondary structure prediction program provided by the ExPASy proteomics server (http://www.expasy.ch) of the Swiss Institute of Bioinformatics (Geneva), CCRG protein was predicted to contain mostly a mixture of alpha helices, beta strands, and coils. The mature CCRG protein has a theoretical molecular weight of 8.62 kDa and a pI of 8.05. These and other analyses indicated that CCRG protein is a colon tumor associated secreted factor.

[0007] Accordingly, the invention features a purified nucleic acid present at higher levels in colon cancer cells than in non-cancerous colon cells and includes a nucleotide sequence that encodes a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 or with a fragment of SEQ ID NO:7 at least 20 residues in length. The nucleotide sequence can be one that defines a polynucleotide whose complement hybridizes under high stringency conditions to the nucleotide sequence of SEQ ID NO:6. The polypeptide encoded by the nucleic acid can have an amino acid sequence consisting of SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length. The nucleic acid can include a fragment of the polynucleotide sequence of SEQ ID NO:6 at least 50 residues long (e.g., one including the polynucleotide sequence of SEQ ID NO:6).

[0008] Also within the invention is a vector including a purified nucleic acid present at higher levels in colon cancer cells than in non-cancerous colon cells, the purified nucleic acid including a nucleotide sequence that encodes a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 or with a fragment of SEQ ID NO:7 at least 20 residues in length. The nucleic acid contained within this vector can be operably linked to one or more expression control sequences. In another aspect, the invention features a cell including a vector of the invention, i.e., a vector including a purified nucleic acid present at higher levels in colon cancer cells than in non-cancerous colon cells.

[0009] The invention also provides a probe including an oligonucleotide and a detectable label attached to the oligonucleotide, the oligonucleotide being at least 15 nucleotides in length and hybridizing under high stringency conditions to the nucleotide sequence of SEQ ID NO:7 or a complement of the nucleotide sequence of SEQ ID NO:7.

[0010] A kit for detecting a purified nucleic acid including a nucleotide sequence that encodes a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 or with a fragment of SEQ ID NO:7 at least 20 residues in length in a cell is also within the invention. The kit includes: a first PCR primer including a first nucleic acid molecule including the nucleotide sequence of SEQ ID NO:2 or SEQ ID NO:9, and a second PCR primer including a second nucleic acid molecule including the nucleotide sequence of SEQ ID NO:3 or SEQ ID NO:10.

[0011] The invention also features a purified polypeptide expressed at higher levels by colon cancer cells than by non-cancerous colon cells. The purified polypeptide includes an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length, e.g., one including a fragment of SEQ ID NO:7 at least 20 residues in length or one including residues 31-111 of the amino acid sequence of SEQ ID NO:7. The purified polypeptide can also include the amino acid sequence of SEQ ID NO:7.

[0012] A purified antibody that specifically binds to a polypeptide including an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length is featured in the invention. This antibody can include a detectable label.

[0013] In a further aspect, the invention provides a method of producing a CCRG polypeptide. This method includes the steps of: (a) providing a cell transformed with a purified nucleic acid including a nucleotide sequence that encodes a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7; (b) culturing the cell under conditions that allow expression of the CCRG polypeptide; and (c) collecting the CCRG polypeptide from the cultured cell.

[0014] A screening method for identifying a substance that modulates expression of a gene encoding a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 is also within the invention. This method includes the steps of: (a) providing a test cell that includes the gene encoding a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7; (b) contacting the test cell with a candidate substance; and (c) detecting an increase or decrease in the expression level of the gene encoding the CCRG polypeptide in the presence of the candidate substance, compared to the expression level of the gene encoding CCRG polypeptide in the absence of the candidate substance, as an indication that the candidate substance modulates the level of expression of the gene encoding the CCRG polypeptide.

[0015] In addition, the invention provides a method for isolating a substance that binds a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7. This method includes the steps of: (a) providing a sample of the CCRG polypeptide immobilized on a substrate; (b) contacting a mixture containing the CCRG polypeptide-binding substance with the immobilized CCRG polypeptide; (c) separating unbound components of the mixture from bound components of the mixture; and (d) recovering the CCRG polypeptide-binding substance from the immobilized CCRG polypeptide.

[0016] A method for detecting the presence of a CCRG nucleic acid or polypeptide in a biological sample is also included within the invention. This method includes the steps of: (a) providing the biological sample; and (b) detecting the presence of the CCRG nucleic acid or polypeptide in the biological sample. In one variation of this method, step (b) of detecting the presence of the CCRG nucleic acid or polypeptide in a biological sample includes: contacting the biological sample with a probe that binds to the CCRG nucleic acid or polypeptide; and detecting binding of the probe to the biological sample. In another variation of this method, step (b) of detecting the presence of the CCRG nucleic acid or polypeptide in a biological sample includes: isolating RNA from the biological sample; generating cDNAs from the isolated RNA; contacting the cDNAs with a first PCR primer that hybridizes to a first portion of a polynucleotide sharing at least 80% sequence identity with SEQ ID NO:6 or a complement of SEQ ID NO:6, and a second PCR primer that hybridizes to a second portion of a polynucleotide sharing at least 80% sequence identity with SEQ ID NO:6 or a complement of SEQ ID NO:6 to form a mixture; subjecting the mixture to reverse transcriptase-polymerase chain reaction to generate PCR amplification products; and analyzing the PCR amplification products by gel electrophoresis.

[0017] Also within the invention is a method for detecting the presence of a colon cancer cell in a biological sample. This method includes the steps of: (a) providing the biological sample; and (b) analyzing the biological sample for the presence of a molecule selected from the group consisting of: a nucleic acid at least 15 nucleotides in length that hybridizes under stringent conditions to the nucleic acid of SEQ ID NO:6 or the complement of SEQ ID NO:6, and a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7. Presence of the molecule in the biological sample indicates that the sample contains a colon cancer cell.

[0018] The invention also provides a method for detecting the presence of a CCRG protein in a biological sample. This method includes the steps of: (a) providing the biological sample; and (b) analyzing the biological sample for the presence of a polypeptide including an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length. Presence of the polypeptide in the biological sample indicates that the sample contains the CCRG protein. In one variation of this method, the step (b) of analyzing the biological sample for the presence of a polypeptide including an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length includes contacting the biological sample with an antibody that specifically binds to a polypeptide including an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length.

[0019] In the foregoing methods, the biological sample can be a cell derived from a colon (e.g., a human colon), feces, urine, blood, plasma, or serum.

[0020] Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Commonly understood definitions of molecular biology terms can be found in Rieger et al., Glossary of Genetics: Classical and Molecular, 5th edition, Springer-Verlag: New York, 1991; and Lewin, Genes V, Oxford University Press: New York, 1994.

[0021] By the term “gene” is meant a nucleic acid molecule that codes for a particular protein, or in certain cases, a functional or structural RNA molecule. For example, the CCRG gene encodes the CCRG protein.

[0022] As used herein, a “nucleic acid” or a “nucleic acid molecule” means a chain of two or more nucleotides such as RNA (ribonucleic acid) and DNA (deoxyribonucleic acid). A “purified” nucleic acid molecule is one that has been substantially separated or isolated away from other nucleic acid sequences in a cell or organism in which the nucleic acid naturally occurs (e.g., 30, 40, 50, 60, 70, 80, 90, 95, 96, 97, 98, 99, 100% free of contaminants). The term includes, e.g., a recombinant nucleic acid molecule incorporated into a vector, a plasmid, a virus, or a genome of a prokaryote or eukaryote. Examples of purified nucleic acids include cDNAs, fragments of genomic nucleic acids, nucleic acids produced by polymerase chain reaction (PCR), nucleic acids formed by restriction enzyme treatment of genomic nucleic acids, recombinant nucleic acids, and chemically synthesized nucleic acid molecules. A “recombinant” nucleic acid molecule is one made by an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated segments of nucleic acids by genetic engineering techniques.

[0023] By the terms “CCRG gene,” “CCRG polynucleotide,” or “CCRG nucleic acid” is meant a native CCRG-encoding nucleic acid sequence, e.g., the native CCRG cDNA (as shown in FIG. 6); a nucleic acid having sequences from which CCRG cDNA can be transcribed; and/or allelic variants and homologs of the foregoing. The terms encompass double-stranded DNA, single-stranded DNA, and RNA.

[0024] As used herein, “protein” or “polypeptide” are used synonymously to mean any peptide-linked chain of amino acids, regardless of length or post-translational modification, e.g., glycosylation or phosphorylation. A “purified” polypeptide is one that has been substantially separated or isolated away from other polypeptides in a cell or organism in which the polypeptide naturally occurs (e.g., 30, 40, 50, 60, 70, 80, 90, 95, 96, 97, 98, 99, 100% free of contaminants).

[0025] By the terms “CCRG protein” or “CCRG polypeptide” is meant an expression product of a CCRG gene such as the native CCRG protein of FIG. 7 (SEQ ID NO:7) or FIG. 8 (amino acid residues 31-111 of SEQ ID NO:7) or a protein that shares at least 65% (but preferably 75, 80, 85, 90, 95, 96, 97, 98, or 99%) amino acid sequence identity with the protein of FIG. 7 or FIG. 8 and displays a functional activity of CCRG. A “functional activity” of a protein is any activity associated with the physiological function of the protein. For example, functional activities of CCRG may include selective expression in certain neoplastic tissues. In addition, the expression of CCRG in the small intestine suggests that it may be an autocrine secreted growth factor in the intestine and that its overexpression in the large intestine (colon) may contribute to tumor formation.

[0026] When referring to a nucleic acid molecule or polypeptide, the term “native” refers to a naturally-occurring (e.g., a “wild-type”) nucleic acid or polypeptide. A “homolog” of a CCRG gene is a gene sequence encoding a CCRG polypeptide isolated from an organism other than a human being. Similarly, a “homolog” of a native CCRG polypeptide is an expression product of a CCRG homolog.

[0027] A “fragment” of a CCRG nucleic acid is a portion of a CCRG nucleic acid that is less than full-length and comprises at least a minimum length capable of hybridizing specifically with a native CCRG nucleic acid under stringent hybridization conditions. The length of such a fragment is preferably at least 15 nucleotides, more preferably at least 20 nucleotides, and most preferably at least 30 nucleotides of a native CCRG nucleic acid sequence. A “fragment” of a CCRG polypeptide is a portion of a CCRG polypeptide that is less than full-length (e.g., a polypeptide consisting of 5, 10, 15, 20, 30, 40, 50, 75, 100 or more amino acids of native CCRG polypeptide), and preferably retains at least one functional activity of native CCRG polypeptide. For example, a polypeptide consisting of amino acids 31-111 of the native CCRG polypeptide (i.e., the polypeptide of SEQ ID NO:7 without the signal peptide) is a fragment of the full length native CCRG polypeptide.

[0028] When referring to hybridization of one nucleic acid to another, “low stringency conditions” means in 10% formamide, 5× Denhardt's solution, 6×SSPE, 0.2% SDS at 42° C., followed by washing in 1×SSPE, 0.2% SDS, at 50° C.; “moderate stringency conditions” means in 50% formamide, 5× Denhardt's solution, 5×SSPE, 0.2% SDS at 42° C., followed by washing in 0.2×SSPE, 0.2% SDS, at 65° C.; and “high stringency conditions” means in 50% formamide, 5× Denhardt's solution, 5×SSPE, 0.2% SDS at 42° C., followed by washing in 0.1×SSPE, and 0.1% SDS at 65° C. The phrase “stringent hybridization conditions” means low, moderate, or high stringency conditions.
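For side-by-side comparison, the three sets of conditions defined above can be collected in a small lookup table. The following Python sketch is illustrative only; the type name, field names, and helper function are hypothetical and are not part of the specification.

```python
from typing import NamedTuple


class Stringency(NamedTuple):
    """Hybridization and wash conditions (field names are illustrative)."""
    formamide_pct: float   # % formamide in the hybridization solution
    denhardt_x: float      # Denhardt's solution concentration (x)
    hyb_sspe_x: float      # SSPE concentration during hybridization (x)
    hyb_sds_pct: float     # % SDS during hybridization
    hyb_temp_c: float      # hybridization temperature, degrees C
    wash_sspe_x: float     # SSPE concentration during the wash (x)
    wash_sds_pct: float    # % SDS during the wash
    wash_temp_c: float     # wash temperature, degrees C


# Values transcribed from the definitions in the preceding paragraph.
STRINGENCY = {
    "low": Stringency(10, 5, 6, 0.2, 42, 1.0, 0.2, 50),
    "moderate": Stringency(50, 5, 5, 0.2, 42, 0.2, 0.2, 65),
    "high": Stringency(50, 5, 5, 0.2, 42, 0.1, 0.1, 65),
}


def is_stringent(name: str) -> bool:
    """True for any condition set covered by 'stringent hybridization conditions'."""
    return name in STRINGENCY
```

Laid out this way, the moderate and high stringency conditions differ only in the wash step (0.2×SSPE, 0.2% SDS versus 0.1×SSPE, 0.1% SDS).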

[0029] As used herein, “sequence identity” means the percentage of identical subunits at corresponding positions in two sequences when the two sequences are aligned to maximize subunit matching, i.e., taking into account gaps and insertions. When a subunit position in both of the two sequences is occupied by the same monomeric subunit, e.g., if a given position is occupied by an adenine in each of two DNA molecules, then the molecules are identical at that position. For example, if 7 positions in a sequence 10 nucleotides in length are identical to the corresponding positions in a second 10-nucleotide sequence, then the two sequences have 70% sequence identity. As another example, if 12 positions in a protein sequence 20 amino acids in length are identical to the corresponding positions in a second 20-amino acid sequence, then the two sequences have 60% sequence identity. Preferably, the length of the compared nucleic acid sequences is at least 60 nucleotides, more preferably at least 75 nucleotides, and most preferably 100 nucleotides; and the length of compared polypeptide sequences is at least 15, 25, and 50 amino acids. Sequence identity is typically measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705).
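The arithmetic in the two examples above can be reproduced with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and that the percentage is taken over the aligned length; the specification does not fix the denominator, and this sketch is not the algorithm of the software package named above. The function name and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned positions occupied by the same subunit.

    Assumes seq_a and seq_b are already aligned (equal length, gaps
    written as `gap`). A position where both sequences carry a gap
    character is not counted as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != gap)
    return 100.0 * matches / len(seq_a)


# Illustrative sequences only (not taken from any SEQ ID NO).
# 7 of 10 nucleotide positions identical:
dna_identity = percent_identity("ACGTACGTAC", "ACGTACGCCA")
# 12 of 20 amino acid positions identical:
protein_identity = percent_identity("MKTAYIAKQRQISFVKSHFS",
                                    "MKTAYIAKQRQIAAAAAAAA")
```

Here `dna_identity` evaluates to 70.0 and `protein_identity` to 60.0, matching the two worked examples in the definition.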

[0030] When referring to mutations in a nucleic acid molecule, “silent” changes are those that substitute one or more base pairs in the nucleotide sequence, but do not change the amino acid sequence of the polypeptide encoded by the sequence. “Conservative” changes are those in which at least one codon in the protein-coding region of the nucleic acid has been changed such that at least one amino acid of the polypeptide encoded by the nucleic acid sequence is substituted with another amino acid having similar characteristics. Examples of conservative amino acid substitutions are ser for ala, thr, or cys; lys for arg; gln for asn, his, or lys; his for asn; glu for asp or lys; asn for his or gln; asp for glu; pro for gly; leu for ile, phe, met, or val; val for ile or leu; ile for leu, met, or val; arg for lys; met for phe; tyr for phe or trp; thr for ser; trp for tyr; and phe for tyr.

[0031] As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as “expression vectors.”

[0032] A first nucleic acid sequence is “operably” linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked nucleic acid sequences are contiguous and, where necessary to join two protein coding regions, in reading frame.

[0033] A cell, tissue, or organism into which has been introduced a foreign nucleic acid, such as a recombinant vector, is considered “transformed,” “transfected,” or “transgenic.” A “transgenic” or “transformed” cell or organism (e.g., a mammal) also includes progeny of the cell or organism. For example, an organism transgenic for CCRG is one in which CCRG nucleic acid has been introduced.

[0034] By the term “CCRG-specific antibody” is meant an antibody that binds a CCRG protein (e.g., a protein having the amino acid sequence of SEQ ID NO:7), and displays no substantial binding to naturally occurring proteins other than those sharing the same antigenic determinants as a CCRG protein. The term includes polyclonal and monoclonal antibodies.

[0035] As used herein, “bind,” “binds,” or “interacts with” means that one molecule recognizes and adheres to a particular second molecule in a sample, but does not substantially recognize or adhere to other structurally unrelated molecules in the sample. Generally, a first molecule that “specifically binds” a second molecule has a binding affinity greater than about 10⁵ to 10⁶ moles/liter for that second molecule.

[0036] The term “labeled,” with regard to a probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In the case of conflict, the present specification, including definitions, will control. In addition, the particular embodiments discussed below are illustrative only and not intended to be limiting.

BRIEF DESCRIPTION OF THE DRAWINGS

[0037] The invention is pointed out with particularity in the appended claims. The above and the further advantages of this invention may be better understood by referring to the following description taken in conjunction with the accompanying drawings, in which:

[0038] FIG. 1 is a photograph of an ethidium bromide-stained agarose gel after electrophoresis of cDNAs from a matched set of tumor and normal tissues. The tissues were analyzed for expression of CCRG (C4) and actin gene. M=100 bp ladder; negative=template minus control; ±RT=cDNAs made in the presence or absence of reverse transcriptase; PBL=genomic DNA from peripheral blood lymphocytes.

[0039] FIG. 2 is a photograph of an ethidium bromide-stained agarose gel after electrophoresis of cDNAs obtained from normal human tissues and analyzed by RT-PCR using SEQ ID NOs: 2 and 3 as PCR primers. The actin gene was used as an internal control. M=100 bp ladder marker; Negative=template minus control.

[0040] FIG. 3 is a photograph of an ethidium bromide-stained agarose gel after electrophoresis of cDNAs obtained from normal and tumor breast, lung, ovary, pancreas and prostate and analyzed by RT-PCR using SEQ ID NOs: 2 and 3 as PCR primers. The actin gene was used as an internal control. M=100 bp ladder; Negative=template minus control; Positive=colon tumor cDNA.

[0041] FIG. 4 is a photograph of an ethidium bromide-stained agarose gel after electrophoresis of cDNAs obtained from matched sets of tumor and normal colon samples from five different patients and analyzed by RT-PCR using SEQ ID NOs: 2 and 3 as PCR primers. The actin gene was used as an internal control. M=100 bp ladder; Negative=template minus control.

[0042] FIG. 5 is an autoradiograph of a blot of cDNAs obtained from a matched set of tumor and normal colon tissue samples. The cDNAs were PCR-amplified using SEQ ID NOs: 2 and 3 as the PCR primers. Amplification products were transferred to a nitrocellulose filter. The filter was hybridized to an end-labeled oligonucleotide (SEQ ID NO: 4) probe and autoradiographed. ±RT=cDNAs made with or without RT; Negative=template minus PCR control; T=colon tumor; N=normal colon; and PBL=genomic DNA from peripheral blood lymphocytes.

[0043] FIG. 6 is the nucleotide sequence of the native CCRG gene.

[0044] FIG. 7 is the amino acid sequence of the processed form (i.e., without the signal peptide) of native CCRG protein.

[0045] FIG. 8 is the amino acid sequence of the unprocessed form (i.e., with the signal peptide) of native CCRG protein.

DETAILED DESCRIPTION

[0046] The invention encompasses compositions and methods relating to the CCRG gene, a human gene associated with cancer. The below described preferred embodiments illustrate adaptations of these compositions and methods. Nonetheless, from the description of these embodiments, other aspects of the invention can be made and/or practiced based on the description provided below.

[0047] Biological Methods

[0048] Methods involving conventional molecular biology techniques are described herein. Such techniques are generally known in the art and are described in detail in methodology treatises such as Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, ed. Sambrook et al., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989; and Current Protocols in Molecular Biology, ed. Ausubel et al., Greene Publishing and Wiley-Interscience, New York, 1992 (with periodic updates). Various techniques using polymerase chain reaction (PCR) are described, e.g., in Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990. PCR-primer pairs can be derived from known sequences by known techniques such as using computer programs intended for that purpose (e.g., Primer, Version 0.5, ©1991, Whitehead Institute for Biomedical Research, Cambridge, Mass.). The Reverse Transcriptase Polymerase Chain Reaction (RT-PCR) method used to identify and amplify certain polynucleotide sequences within the invention was performed as described in Elek et al., In Vivo, 14:172-182, 2000. Methods for chemical synthesis of nucleic acids are discussed, for example, in Beaucage and Carruthers, Tetra. Letts. 22:1859-1862, 1981, and Matteucci et al., J. Am. Chem. Soc. 103:3185, 1981. Chemical synthesis of nucleic acids can be performed, for example, on commercial automated oligonucleotide synthesizers. Immunological methods (e.g., preparation of antigen-specific antibodies, immunoprecipitation, and immunoblotting) are described, e.g., in Current Protocols in Immunology, ed. Coligan et al., John Wiley & Sons, New York, 1991; and Methods of Immunological Analysis, ed. Masseyeff et al., John Wiley & Sons, New York, 1992. Conventional methods of gene transfer and gene therapy can also be adapted for use in the present invention. See, e.g., Gene Therapy: Principles and Applications, ed. T. Blankenstein, Springer Verlag, 1999; Gene Therapy Protocols (Methods in Molecular Medicine), ed. P. D. Robbins, Humana Press, 1997; and Retro-vectors for Human Gene Therapy, ed. C. P. Hodgson, Springer Verlag, 1996.

[0049] Nucleic Acids Encoding CCRG

[0050] The present invention utilizes the human CCRG gene, which has now been cloned and sequenced. A preferred nucleic acid molecule for use in the invention is the native CCRG polynucleotide shown in FIG. 6 (SEQ ID NO:6) and deposited with Genbank as Accession No. AF323921. The clone G6 containing the full length CCRG gene (SEQ ID NO:6) in the PEAK 8 expression vector (Edge Biosystems) has been deposited with the American Type Culture Collection (Rockville, Md.) as Accession No. ______. Another nucleic acid that can be used in various aspects of the invention includes a purified nucleic acid (polynucleotide) that encodes a polypeptide having either the amino acid sequence of FIG. 7 (SEQ ID NO:7) or the amino acid sequence of FIG. 8 (amino acid residues 31-111 of SEQ ID NO:7). As the native CCRG gene was originally cloned from a small intestine cDNA library, nucleic acid molecules encoding a polypeptide of the present invention can be obtained from such a library or from any human colon tumor tissue itself by conventional cloning methods such as those described herein.

[0051] Nucleic acid molecules utilized in the present invention may be in the form of RNA or in the form of DNA (e.g., cDNA, genomic DNA, and synthetic DNA). The DNA may be double-stranded or single-stranded, and if single-stranded may be the coding (sense) strand or non-coding (anti-sense) strand. The coding sequence which encodes the native CCRG protein may be identical to the nucleotide sequence shown in FIG. 6 (SEQ ID NO:6). It may also be a different coding sequence which, as a result of the redundancy or degeneracy of the genetic code, encodes the same polypeptide as the polynucleotide of SEQ ID NO:6.
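The effect of codon degeneracy can be illustrated with a short Python sketch that translates two different coding sequences into the same peptide using the standard genetic code. The function name and the two nine-nucleotide sequences are hypothetical examples chosen for illustration; they are not taken from SEQ ID NO:6.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a coding (sense) strand, DNA or RNA, in reading frame 1."""
    seq = coding_sequence.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Two different (made-up) coding sequences for the same tripeptide Met-Leu-Ser.
peptide_1 = translate("ATGCTGAGC")
peptide_2 = translate("ATGTTATCT")
```

Both calls yield "MLS" although the two sequences differ at five of nine nucleotide positions, which is the sense in which a different coding sequence can encode the same polypeptide.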

[0052] Other nucleic acid molecules within the invention are variants of the native CCRG gene such as those that encode fragments (e.g., post-translationally processed forms of), analogs and derivatives of a native CCRG protein. Such variants may be, e.g., a naturally occurring allelic variant of the native CCRG gene, a homolog of the native CCRG gene, or a non-naturally occurring variant of the native CCRG gene. These variants have a nucleotide sequence that differs from the native CCRG gene in one or more bases. For example, the nucleotide sequence of such variants can feature a deletion, addition, or substitution of one or more nucleotides of the native CCRG gene. Nucleic acid insertions are preferably of about 1 to 10 contiguous nucleotides, and deletions are preferably of about 1 to 30 contiguous nucleotides.

[0053] In other applications, variant CCRG proteins displaying substantial changes in structure can be generated by making nucleotide substitutions that cause less than conservative changes in the encoded polypeptide. Examples of such nucleotide substitutions are those that cause changes in (a) the structure of the polypeptide backbone; (b) the charge or hydrophobicity of the polypeptide; or (c) the bulk of an amino acid side chain. Nucleotide substitutions generally expected to produce the greatest changes in protein properties are those that cause non-conservative changes in codons. Examples of codon changes that are likely to cause major changes in protein structure are those that cause substitution of (a) a hydrophilic residue, e.g., serine or threonine, for (or by) a hydrophobic residue, e.g., leucine, isoleucine, phenylalanine, valine or alanine; (b) a cysteine or proline for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysine, arginine, or histidine, for (or by) an electronegative residue, e.g., glutamate or aspartate; or (d) a residue having a bulky side chain, e.g., phenylalanine, for (or by) one not having a side chain, e.g., glycine.

[0054] Naturally occurring allelic variants of the native CCRG gene within the invention are nucleic acids isolated from human tissue that have at least 75% (e.g., 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, and 99%) sequence identity with the native CCRG gene, and encode polypeptides having structural similarity to native CCRG protein. Homologs of the native CCRG gene within the invention are nucleic acids isolated from other species that have at least 75% (e.g., 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, and 99%) sequence identity with the native CCRG gene, and encode polypeptides having structural similarity to native CCRG protein. Public and/or proprietary nucleic acid databases can be searched in an attempt to identify other nucleic acid molecules having a high percent (e.g., 70, 80, 90% or more) sequence identity to the native CCRG gene.

[0055] Non-naturally occurring CCRG gene variants are nucleic acids that do not occur in nature (e.g., are made by the hand of man), have at least 75% (e.g., 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, and 99%) sequence identity with the native CCRG gene, and encode polypeptides having structural similarity to native CCRG protein. Examples of non-naturally occurring CCRG gene variants are those that encode a fragment of a CCRG protein, those that hybridize to the native CCRG gene or a complement of the native CCRG gene under stringent conditions, those that share at least 65% sequence identity with the native CCRG gene or a complement of the native CCRG gene, and those that encode a CCRG fusion protein.

[0056] Nucleic acids encoding fragments of native CCRG protein within the invention are those that encode, e.g., 2, 5, 10, 25, 30, 40, 50, 60, 70, 80, 90, 100, or more amino acid residues of the native CCRG protein. Shorter oligonucleotides (e.g., those of 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 50, or 100 base pairs in length) that encode or hybridize with nucleic acids that encode fragments of the native CCRG protein can be used as probes, primers, or antisense molecules. Longer polynucleotides (e.g., those of 125, 150, 175, 200, 225, 250, 275, 300, or more base pairs) that encode or hybridize with nucleic acids that encode fragments of native CCRG protein can also be used in various aspects of the invention. Nucleic acids encoding fragments of native CCRG protein can be made by enzymatic digestion (e.g., using a restriction enzyme) or chemical degradation of the full length native CCRG gene or variants thereof.

[0057] Nucleic acids that hybridize under stringent conditions to the nucleic acid of SEQ ID NO:6 or the complement of SEQ ID NO:6 can also be used in the invention. For example, nucleic acids that hybridize to SEQ ID NO:6 or the complement of SEQ ID NO:6 under low stringency conditions, moderate stringency conditions, or high stringency conditions are within the invention. Preferred such nucleic acids are those having a nucleotide sequence that is the complement of all or a portion of SEQ ID NO:6. Other variants of the native CCRG gene within the invention are polynucleotides that share at least 65% (e.g., 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, and 99%) sequence identity to SEQ ID NO:6 or the complement of SEQ ID NO:6. Nucleic acids that hybridize under stringent conditions to or share at least 65% sequence identity with SEQ ID NO:6 or the complement of SEQ ID NO:6 can be obtained by techniques known in the art such as by making mutations in the native CCRG gene, or by isolation from an organism expressing such a nucleic acid (e.g., an allelic variant).

[0058] Nucleic acid molecules encoding CCRG fusion proteins are also within the invention. Such nucleic acids can be made by preparing a construct (e.g., an expression vector) that expresses a CCRG fusion protein when introduced into a suitable host. For example, such a construct can be made by ligating a first polynucleotide encoding a CCRG protein in frame with a second polynucleotide encoding another protein (e.g., a detectable label or a cytotoxin) such that expression of the construct in a suitable expression system yields a fusion protein.

[0059] The oligonucleotides of the invention can be DNA or RNA or chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded. Such oligonucleotides can be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, hybridization, etc. Oligonucleotides within the invention may additionally include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/09810, published Dec. 15, 1988), hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976), or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotides may be conjugated to another molecule, e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.

[0060] Using the nucleotide sequence of the native CCRG gene and the amino acid sequence of a native CCRG protein, those skilled in the art can create nucleic acid molecules that have minor variations in their nucleotide sequence by, for example, standard nucleic acid mutagenesis techniques or chemical synthesis. Variant CCRG nucleic acid molecules can be expressed to produce variant CCRG proteins.

[0061] Antisense, Ribozyme, Triplex Techniques

[0062] Another aspect of the invention relates to the use of purified antisense nucleic acids to inhibit expression of CCRG. Antisense nucleic acid molecules within the invention are those that specifically hybridize (e.g., bind) under cellular conditions to cellular mRNA and/or genomic DNA encoding a CCRG protein in a manner that inhibits expression of the CCRG protein, e.g., by inhibiting transcription and/or translation. The binding may be by conventional base pair complementarity, or, for example, in the case of binding to DNA duplexes, through specific interactions in the major groove of the double helix.

[0063] Antisense constructs can be delivered, for example, as an expression plasmid which, when transcribed in the cell, produces RNA which is complementary to at least a unique portion of the cellular mRNA which encodes a CCRG protein. Alternatively, the antisense construct can take the form of an oligonucleotide probe generated ex vivo which, when introduced into a CCRG protein expressing cell, causes inhibition of CCRG protein expression by hybridizing with mRNA and/or genomic sequences coding for CCRG protein. Such oligonucleotide probes are preferably modified oligonucleotides that are resistant to endogenous nucleases, e.g., exonucleases and/or endonucleases, and are therefore stable in vivo. Exemplary nucleic acid molecules for use as antisense oligonucleotides are phosphoramidate, phosphorothioate and methylphosphonate analogs of DNA (see, e.g., U.S. Pat. Nos. 5,176,996; 5,264,564; and 5,256,775). Additionally, general approaches to constructing oligomers useful in antisense therapy have been reviewed, for example, by Van der Krol et al. (1988) Biotechniques 6:958-976; and Stein et al. (1988) Cancer Res 48:2659-2668. With respect to antisense DNA, oligodeoxyribonucleotides derived from the translation initiation site, e.g., between the −10 and +10 regions of a CCRG protein encoding nucleotide sequence, are preferred.

[0064] Antisense approaches involve the design of oligonucleotides (either DNA or RNA) that are complementary to CCRG mRNA. The antisense oligonucleotides will bind to CCRG mRNA transcripts and prevent translation. Absolute complementarity, although preferred, is not required. The ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an RNA it may contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures to determine the melting point of the hybridized complex. Oligonucleotides that are complementary to the 5′ end of the message, e.g., the 5′ untranslated sequence up to and including the AUG initiation codon, should work most efficiently at inhibiting translation. However, sequences complementary to the 3′ untranslated sequences of mRNAs have been shown to be effective at inhibiting translation of mRNAs as well (Wagner, R. (1994) Nature 372:333). Therefore, oligonucleotides complementary to either the 5′ or 3′ untranslated, non-coding regions of a CCRG gene could be used in an antisense approach to inhibit translation of endogenous CCRG mRNA. Oligonucleotides complementary to the 5′ untranslated region of the mRNA should include the complement of the AUG start codon. Antisense oligonucleotides complementary to mRNA coding regions are less efficient inhibitors of translation but could be used in accordance with the invention. Whether designed to hybridize to the 5′, 3′ or coding region of CCRG mRNA, antisense nucleic acids should be at least eighteen nucleotides in length (e.g., 18, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90 nucleotides in length), and are preferably less than about 100 nucleotides in length. An exemplary antisense oligonucleotide is an oligonucleotide that is the complement of the oligonucleotide shown herein as SEQ ID NO:5. For example, an oligonucleotide having the nucleotide sequence of 5′ TCC TTG ATC TTC TTA TCC ATA ACG 3′ (SEQ ID NO:8) could be used as an antisense oligonucleotide.
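The relationship between an antisense oligonucleotide and the sense-strand sequence it pairs with can be sketched in a few lines. The example below takes the SEQ ID NO:8 sequence quoted above and computes its reverse complement; the function and variable names are illustrative assumptions, and the code performs only Watson-Crick complementation of a DNA string.

```python
# Translation table for Watson-Crick complementation of DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'.

    Whitespace (e.g., the triplet spacing used in sequence listings) is removed.
    """
    cleaned = "".join(dna.split())
    return cleaned.translate(_COMPLEMENT)[::-1]


# Antisense oligonucleotide as quoted in the text (SEQ ID NO:8), 5' to 3'.
SEQ_ID_NO_8 = "TCC TTG ATC TTC TTA TCC ATA ACG"

# Sense-strand sequence (as DNA) to which the oligonucleotide is complementary.
target_sense = reverse_complement(SEQ_ID_NO_8)
print(target_sense)
```

The computed sense-strand sequence contains an ATG triplet, which is consistent with the stated preference for oligonucleotides that include the complement of the initiation codon.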

[0065] Regardless of the choice of target sequence, it is preferred that in vitro studies are first performed to quantitate the ability of the antisense oligonucleotide to inhibit gene expression. It is preferred that these studies utilize controls that distinguish between antisense gene inhibition and nonspecific biological effects of oligonucleotides. It is also preferred that these studies compare levels of the target RNA or protein with that of an internal control RNA or protein. Additionally, it is envisioned that results obtained using the antisense oligonucleotide are compared with those obtained using a control oligonucleotide. It is preferred that the control oligonucleotide is of approximately the same length as the test oligonucleotide and that the nucleotide sequence of the control oligonucleotide differs from the antisense sequence no more than is necessary to prevent specific hybridization to the target sequence.

[0066] Antisense oligonucleotides of the invention may comprise at least one modified base moiety which is selected from the group including but not limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxyethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Antisense oligonucleotides of the invention may also comprise at least one modified sugar moiety selected from the group including but not limited to arabinose, 2-fluoroarabinose, xylulose, and hexose; and may additionally include at least one modified phosphate backbone selected from the group consisting of a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphorodiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.

[0067] In yet a further embodiment, the antisense oligonucleotide is an alpha-anomeric oligonucleotide. An alpha-anomeric oligonucleotide forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual beta-units, the strands run parallel to each other (Gautier et al. (1987) Nucl. Acids Res. 15:6625-6641). Such an oligonucleotide can be a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucl. Acids Res. 15:6131-6148), or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[0068] Oligonucleotides of the invention may be synthesized by standard methods known in the art, e.g., by use of an automated DNA synthesizer (such as are commercially available from Biosearch, Applied Biosystems, etc.). As examples, phosphorothioate oligonucleotides may be synthesized by the method of Stein et al. (1988) Nucl. Acids Res. 16:3209, and methylphosphonate oligonucleotides can be prepared by use of controlled pore glass polymer supports (Sarin et al. (1988) Proc. Natl. Acad. Sci. U.S.A. 85:7448-7451).

[0069] The antisense molecules should be delivered into cells that express CCRG in vivo. A number of methods have been developed for delivering antisense DNA or RNA into cells. For instance, antisense molecules can be introduced directly into the tissue site by such standard techniques as electroporation, liposome-mediated transfection, CaCl₂-mediated transfection, or the use of a gene gun. Alternatively, modified antisense molecules, designed to target the desired cells (e.g., antisense linked to peptides or antibodies that specifically bind receptors or antigens expressed on the target cell surface) can be used.

[0070] Because it is often difficult to achieve intracellular concentrations of the antisense sufficient to suppress translation of endogenous mRNAs, a preferred approach utilizes a recombinant DNA construct in which the antisense oligonucleotide is placed under the control of a strong promoter (e.g., the CMV promoter). The use of such a construct to transform cells will result in the transcription of sufficient amounts of single stranded RNAs that will form complementary base pairs with the endogenous CCRG transcripts and thereby prevent translation of CCRG mRNA.

[0071] Ribozyme molecules designed to catalytically cleave CCRG mRNA transcripts can also be used to prevent translation of CCRG mRNA and expression of CCRG protein (see, e.g., PCT Publication No. WO 90/11364, published Oct. 4, 1990; Sarver et al. (1990) Science 247:1222-1225 and U.S. Pat. No. 5,093,246). While ribozymes that cleave mRNA at site specific recognition sequences can be used to destroy CCRG mRNAs, the use of hammerhead ribozymes is preferred. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The sole requirement is that the target mRNA have the following sequence of two bases: 5′-UG-3′. The construction and production of hammerhead ribozymes is well known in the art and is described more fully in Haseloff and Gerlach (1988) Nature 334:585-591. There are several potential hammerhead ribozyme cleavage sites within the nucleotide sequence of the native CCRG gene. Preferably the ribozyme is engineered so that the cleavage recognition site is located near the 5′ end of CCRG mRNA, i.e., to increase efficiency and minimize the intracellular accumulation of non-functional mRNA transcripts. Ribozymes within the invention can be delivered to a cell using a vector as described below.
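A simple scan for candidate cleavage sites, using only the two-base 5′-UG-3′ criterion stated above, might look like the following sketch. The function name and the toy input sequence are hypothetical; the native CCRG sequence is not reproduced here, and a practical design would also have to choose flanking arms for base pairing.

```python
from typing import List


def find_ug_sites(mrna: str) -> List[int]:
    """Return 0-based positions of every 5'-UG-3' dinucleotide.

    Accepts RNA or cDNA-style input; T is read as U. Positions are returned
    in ascending order, so sites nearest the 5' end come first.
    """
    rna = mrna.upper().replace("T", "U")
    return [i for i in range(len(rna) - 1) if rna[i:i + 2] == "UG"]


# Hypothetical toy transcript, not CCRG sequence.
toy_mrna = "AUGGCUUGA"
sites = find_ug_sites(toy_mrna)
print(sites)
```

Because the list is ordered from the 5′ end, the first entries correspond to the sites the text describes as preferred.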

[0072] Endogenous CCRG gene expression can also be reduced by inactivating or “knocking out” the CCRG gene or its promoter using targeted homologous recombination. See, e.g., Kempin et al., Nature 389:802 (1997); Smithies et al. (1985) Nature 317:230-234; Thomas and Capecchi (1987) Cell 51:503-512; and Thompson et al. (1989) Cell 5:313-321. For example, a mutant, non-functional CCRG gene variant (or a completely unrelated DNA sequence) flanked by DNA homologous to the endogenous CCRG gene (either the coding regions or regulatory regions of the CCRG gene) can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express CCRG protein in vivo.

[0073] Alternatively, endogenous CCRG gene expression might be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory region of the CCRG gene (i.e., the CCRG promoter and/or enhancers) to form triple helical structures that prevent transcription of the CCRG gene in target cells. (See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C., et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioassays 14(12):807-15.)

[0074] Nucleic acid molecules to be used in triple helix formation for the inhibition of transcription are preferably single stranded and composed of deoxyribonucleotides. The base composition of these oligonucleotides should promote triple helix formation via Hoogsteen base pairing rules, which generally require sizable stretches of either purines or pyrimidines to be present on one strand of a duplex. Nucleotide sequences may be pyrimidine-based, which will result in TAT and CGC triplets across the three associated strands of the resulting triple helix. The pyrimidine-rich molecules provide base complementarity to a purine-rich region of a single strand of the duplex in a parallel orientation to that strand. In addition, nucleic acid molecules may be chosen that are purine-rich, for example, containing a stretch of G residues. These molecules will form a triple helix with a DNA duplex that is rich in GC pairs, in which the majority of the purine residues are located on a single strand of the targeted duplex, resulting in GGC triplets across the three strands in the triplex.
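Candidate triplex target regions of the kind described above can be located by searching one strand of a duplex for uninterrupted purine stretches. The sketch below is illustrative only: the function name, the toy sequence, and the default minimum length are assumptions, since the text says only that the stretches should be "sizable".

```python
import re
from typing import List, Tuple


def purine_runs(strand: str, min_length: int = 10) -> List[Tuple[int, str]]:
    """Find uninterrupted runs of purines (A or G) on one DNA strand.

    Returns (0-based start, run) pairs for runs of at least `min_length` bases.
    `min_length` is an illustrative assumption, not a value from the text.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [(m.start(), m.group()) for m in pattern.finditer(strand.upper())]


# Hypothetical toy promoter fragment, not CCRG sequence.
toy_strand = "CTAGGAGGAAGGCT"
print(purine_runs(toy_strand, min_length=8))
```

A pyrimidine stretch on the scanned strand corresponds to a purine stretch on its complement, so scanning both strands (or the reverse complement) covers both cases.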

[0075] Alternatively, the potential sequences that can be targeted for triple helix formation may be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[0076] Antisense RNA and DNA, ribozyme, and triple helix molecules of the invention may be prepared by any method known in the art for the synthesis of DNA and RNA molecules. These include techniques for chemically synthesizing oligodeoxyribonucleotides and oligoribonucleotides well known in the art such as, for example, solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding the antisense RNA molecule. Such DNA sequences may be incorporated into a wide variety of vectors which incorporate suitable RNA polymerase promoters. Alternatively, antisense cDNA constructs that synthesize antisense RNA constitutively or inducibly, depending on the promoter used, can be introduced stably into cell lines.

[0077] Moreover, various well-known modifications to nucleic acid molecules may be introduced as a means of increasing intracellular stability and half-life. Possible modifications include but are not limited to the addition of flanking sequences of ribonucleotides or deoxyribonucleotides to the 5′ and/or 3′ ends of the molecule or the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the oligodeoxyribonucleotide backbone.

[0078] Probes and Primers

[0079] The invention also includes oligonucleotide probes (i.e., isolated nucleic acid molecules conjugated with a detectable label or reporter molecule, e.g., a radioactive isotope, ligand, chemiluminescent agent, or enzyme); and oligonucleotide primers (i.e., isolated nucleic acid molecules that can be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, then extended along the target DNA strand by a polymerase, e.g., a DNA polymerase). Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other conventional nucleic-acid amplification methods. Probes and primers within the invention are generally 15 nucleotides or more in length, preferably 20 nucleotides or more, more preferably 25 nucleotides or more, and most preferably 30 nucleotides or more. Preferred probes and primers are those that hybridize to the native CCRG gene sequence under high stringency conditions, and those that hybridize to CCRG gene homologs under at least moderate stringency conditions. Preferably, probes and primers according to the present invention have complete sequence identity with the native CCRG gene sequence, although probes differing from the native CCRG gene sequence and that retain the ability to hybridize to native CCRG gene sequences under stringent conditions may be designed by conventional methods. Primers and probes based on the native CCRG gene sequences disclosed herein can be used to confirm (and, if necessary, to correct) the disclosed native CCRG gene sequence by conventional methods, e.g., by re-cloning and sequencing a native CCRG cDNA. Particularly preferred primer pairs for use in the invention are shown as SEQ ID NO:2 and SEQ ID NO:3; and SEQ ID NO:9 and SEQ ID NO:10, both pairs having been shown to selectively amplify CCRG gene sequences, the former amplifying a 455 bp product, the latter amplifying a 267 bp product including the signal sequence and most of the CDS of the CCRG gene. A particularly preferred oligonucleotide probe for use in the invention is shown as SEQ ID NO:4.
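The expected size of an amplification product can be checked in silico by locating the forward primer and the reverse complement of the reverse primer on the template. The following sketch assumes exact matches and uses hypothetical toy sequences; the primer sequences of SEQ ID NOs:2, 3, 9 and 10 are not reproduced in this section, so the 455 bp and 267 bp products are not recomputed here.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length of the product bounded by an exactly matching primer pair.

    `template` is the sense strand, 5' to 3'. The reverse primer anneals to
    that strand, so its reverse complement is searched for downstream of the
    forward primer. Returns None if either primer fails to match.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse.upper())
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return end + len(reverse_site) - start


# Hypothetical toy template and primers, for illustration only.
toy_template = "GGATGCCGTACCCCTTTTGATTACAGAA"
print(amplicon_length(toy_template, "ATGCCGTA", "CTGTAATC"))
```

Exact string matching is the simplest possible model; it ignores mismatch tolerance and stringency, which the text treats as adjustable conditions.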

[0080] CCRG Proteins

[0081] In other aspects, the present invention utilizes a purified CCRG protein encoded by a nucleic acid of the invention. Preferred forms of CCRG protein include a purified native CCRG protein that has either the deduced amino acid sequence shown in FIG. 7 (SEQ ID NO:7) or the amino acid sequence shown in FIG. 8. Variants of native CCRG proteins such as fragments, analogs and derivatives of native CCRG are also within the invention. Such variants include, e.g., a polypeptide encoded by a naturally occurring allelic variant of native CCRG gene, a polypeptide encoded by a homolog of native CCRG gene, and a polypeptide encoded by a non-naturally occurring variant of native CCRG gene.

[0082] CCRG protein variants have a peptide sequence that differs from a native CCRG protein in one or more amino acids. The peptide sequence of such variants can feature a deletion, addition, or substitution of one or more amino acids of a native CCRG polypeptide. Amino acid insertions are preferably of about 1 to 4 contiguous amino acids, and deletions are preferably of about 1 to 10 contiguous amino acids. In some applications, variant CCRG proteins substantially maintain a native CCRG protein functional activity (e.g., association with cancer). For other applications, variant CCRG proteins lack or feature a significant reduction in a CCRG protein functional activity. Where it is desired to retain a functional activity of native CCRG protein, preferred CCRG protein variants can be made by expressing nucleic acid molecules within the invention that feature silent or conservative changes. Variant CCRG proteins with substantial changes in functional activity can be made by expressing nucleic acid molecules within the invention that feature less than conservative changes.

[0083] CCRG protein fragments corresponding to one or more particular motifs and/or domains or to arbitrary sizes, for example, at least 5, 10, 25, 30, 40, 50, 70, 75, 80, 90, and 100 amino acids in length are within the scope of the present invention. Isolated peptidyl portions of CCRG proteins can be obtained by screening peptides recombinantly produced from the corresponding fragment of the nucleic acid encoding such peptides. In addition, fragments can be chemically synthesized using techniques known in the art such as conventional Merrifield solid phase f-Moc or t-Boc chemistry. For example, a CCRG protein of the present invention may be arbitrarily divided into fragments of desired length with no overlap of the fragments, or preferably divided into overlapping fragments of a desired length. The fragments can be produced (recombinantly or by chemical synthesis) and tested to identify those peptidyl fragments which can function as either agonists or antagonists of native CCRG protein.
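Dividing a protein into overlapping fragments of a desired length can be sketched as a sliding window. In the example below, the function name, the `step` parameter and the toy peptide are illustrative assumptions; setting `step` equal to `length` gives non-overlapping windows, and a final fragment is appended when needed so that the C-terminal residues are always covered.

```python
from typing import List


def overlapping_fragments(protein: str, length: int, step: int) -> List[str]:
    """Split a protein sequence into fragments of `length` residues.

    Consecutive fragments start `step` residues apart, so they overlap by
    `length - step` residues when step < length. If the windows do not end
    exactly at the C-terminus, one extra C-terminal fragment is appended.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(protein) <= length:
        return [protein]
    last_start = len(protein) - length
    fragments = [protein[i:i + length] for i in range(0, last_start + 1, step)]
    if last_start % step != 0:
        fragments.append(protein[last_start:])  # cover the C-terminal residues
    return fragments


# Hypothetical ten-residue peptide, not CCRG sequence.
toy_peptide = "MKTAYIAKQR"
print(overlapping_fragments(toy_peptide, length=4, step=2))
```

With `step` equal to `length`, the appended final fragment may overlap its predecessor; that is a design choice of this sketch, made so that no residues are left untested.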

[0084] Another aspect of the present invention concerns recombinant forms of the CCRG proteins. Recombinant polypeptides preferred by the present invention, in addition to native CCRG protein, are encoded by a nucleic acid that has at least 85% sequence identity (e.g., 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100%) with the nucleic acid sequence of SEQ ID NO:6. In a preferred embodiment, variant CCRG proteins have one or more functional activities of native CCRG protein.

[0085] CCRG protein variants can be generated through various techniques known in the art. For example, CCRG protein variants can be made by mutagenesis, such as by introducing discrete point mutation(s), or by truncation. Mutation can give rise to a CCRG protein variant having substantially the same, or merely a subset of, the functional activity of a native CCRG protein. Alternatively, antagonistic forms of the protein can be generated which are able to inhibit the function of the naturally occurring form of the protein, such as by competitively binding to another molecule that interacts with a CCRG protein. In addition, agonistic forms of the protein may be generated that constitutively express one or more CCRG functional activities. Other variants of CCRG that can be generated include those that are resistant to proteolytic cleavage, as for example, due to mutations which alter protease target sequences. Whether a change in the amino acid sequence of a peptide results in a CCRG protein variant having one or more functional activities of native CCRG protein can be readily determined by testing the variant for a native CCRG protein functional activity.

[0086] As another example, CCRG protein variants can be generated from a degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be carried out in an automatic DNA synthesizer, and the synthetic genes then ligated into an appropriate expression vector. The purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential CCRG protein sequences. The synthesis of degenerate oligonucleotides is well known in the art (see, for example, Narang, S A (1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA, Proc 3rd Cleveland Sympos. Macromolecules, ed. A G Walton, Amsterdam: Elsevier pp 273-289; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477). Such techniques have been employed in the directed evolution of other proteins (see, for example, Scott et al. (1990) Science 249:386-390; Roberts et al. (1992) Proc. Natl. Acad. Sci. USA 89:2429-2433; Devlin et al. (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; as well as U.S. Pat. Nos. 5,223,409; 5,198,346; and 5,096,815).
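What a degenerate oligonucleotide provides "in one mixture" can be made concrete by expanding its ambiguity codes into the individual sequences they represent. The sketch below uses the standard IUPAC nucleotide ambiguity codes; the function name and the example inputs are assumptions for illustration.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> List[str]:
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


# A degenerate triplet: A, then A or G, then C or T.
print(expand_degenerate("ARY"))
# Number of distinct codons encoded by the commonly used NNK scheme.
print(len(expand_degenerate("NNK")))
```

The number of sequences grows multiplicatively with each degenerate position, so full enumeration is practical only for short oligonucleotides.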

[0087] Similarly, a library of coding sequence fragments can be provided for a CCRG gene clone in order to generate a variegated population of CCRG protein fragments for screening and subsequent selection of fragments having one or more native CCRG functional activities. A variety of techniques are known in the art for generating such libraries, including chemical synthesis. In one embodiment, a library of coding sequence fragments can be generated by (i) treating a double-stranded PCR fragment of a CCRG gene coding sequence with a nuclease under conditions wherein nicking occurs only about once per molecule; (ii) denaturing the double-stranded DNA; (iii) renaturing the DNA to form double-stranded DNA which can include sense/antisense pairs from different nicked products; (iv) removing single-stranded portions from reformed duplexes by treatment with S1 nuclease; and (v) ligating the resulting fragment library into an expression vector. By this exemplary method, an expression library can be derived which codes for N-terminal, C-terminal and internal fragments of various sizes.

[0088] A wide range of techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a certain property. Such techniques will be generally adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of CCRG gene variants. The most widely used techniques for screening large gene libraries typically comprise cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected.

[0089] Combinatorial mutagenesis has a potential to generate very large libraries of mutant proteins, e.g., in the order of 10²⁶ molecules. Combinatorial libraries of this size may be technically challenging to screen even with high throughput screening assays. To overcome this problem, techniques such as recursive ensemble mutagenesis (REM) that allow one to avoid the very high proportion of non-functional proteins in a random library and simply enhance the frequency of functional proteins (thus decreasing the complexity required to achieve a useful sampling of sequence space) can be used. REM is an algorithm which enhances the frequency of functional mutants in a library when an appropriate selection or screening method is employed (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Youvan et al. (1992) Parallel Problem Solving from Nature, 2., In Maenner and Manderick, eds., Elsevier Publishing Co., Amsterdam, pp. 401-410; Delagrave et al. (1993) Protein Engineering 6(3):327-331).
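The order of magnitude quoted above can be reproduced with simple arithmetic. As an assumption made purely for illustration (the text does not say how the figure was derived), fully randomizing 20 residue positions over the 20 standard amino acids gives 20²⁰ possible sequences; the screening capacity used below is likewise a hypothetical number.

```python
def library_size(randomized_positions: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences when each position is fully randomized."""
    return alphabet_size ** randomized_positions


def fraction_screened(screened: int, randomized_positions: int) -> float:
    """Fraction of the theoretical library covered by `screened` variants."""
    return screened / library_size(randomized_positions)


size = library_size(20)  # 20 positions x 20 amino acids (illustrative assumption)
print(f"{size:.3e}")
# Hypothetical screen of one billion variants.
print(f"{fraction_screened(10**9, 20):.1e}")
```

Under these assumptions a billion-member screen samples a vanishingly small fraction of the theoretical library, which is the practical difficulty that motivates approaches such as REM.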

[0090] The invention also provides for reduction of CCRG proteins to generate mimetics, e.g., peptide or non-peptide agents, that are able to disrupt binding of a CCRG protein to other proteins or molecules with which a native CCRG protein interacts. Thus, the mutagenic techniques described can also be used to map which determinants of a CCRG protein participate in protein-protein interactions involved in, for example, binding of a CCRG protein to other proteins which may function upstream (including both activators and repressors of its activity) of the CCRG protein or to proteins or nucleic acids which may function downstream of the CCRG protein, and whether such molecules are positively or negatively regulated by the CCRG protein. To illustrate, the critical residues of a CCRG protein which are involved in molecular recognition of, for example, a molecule having a moiety that binds the CCRG protein can be determined and used to generate CCRG protein-derived peptidomimetics which competitively inhibit binding of CCRG protein with that moiety. By employing, for example, scanning mutagenesis to map the amino acid residues of a CCRG protein that are involved in binding other proteins, peptidomimetic compounds can be generated which mimic those residues of native CCRG protein. Such mimetics may then be used to interfere with the normal function of a CCRG protein. For instance, non-hydrolyzable peptide analogs of such residues can be generated using benzodiazepine (e.g., see Freidinger et al. in Peptides: Chemistry and Biology, G. R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), azepine (e.g., see Huffman et al. in Peptides: Chemistry and Biology, G. R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), substituted gamma lactam rings (Garvey et al. in Peptides: Chemistry and Biology, G. R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), keto-methylene pseudopeptides (Ewenson et al. (1986) J. Med. Chem. 29:295; and Ewenson et al. in Peptides: Structure and Function (Proceedings of the 9th American Peptide Symposium) Pierce Chemical Co. Rockland, Ill. 1985), beta-turn dipeptide cores (Nagai et al. (1985) Tetrahedron Lett 26:647; and Sato et al. (1986) J. Chem. Soc. Perkin. Trans. 1:1231), and β-aminoalcohols (Gordon et al. (1985) Biochem. Biophys. Res. Commun. 126:419; and Dann et al. (1986) Biochem. Biophys. Res. Commun. 134:71). CCRG proteins may also be chemically modified to create CCRG derivatives by forming covalent or aggregate conjugates with other chemical moieties, such as glycosyl groups, lipids, phosphate, acetyl groups and the like. Covalent derivatives of CCRG protein can be prepared by linking the chemical moieties to functional groups on amino acid side chains of the protein or at the N-terminus or at the C-terminus of the polypeptide.

[0091] The present invention further pertains to methods of producingthe subject CCRG proteins. For example, a host cell transfected with anucleic acid vector directing expression of a nucleotide sequenceencoding the subject polypeptides can be cultured under appropriateconditions to allow expression of the peptide to occur. The cells may beharvested, lysed, and the protein isolated. A recombinant CCRG proteincan be isolated from host cells using techniques known in the art forpurifying proteins including ion-exchange chromatography, gel filtrationchromatography, ultrafiltration, electrophoresis, and immunoaffinitypurification with antibodies specific for such protein.

[0092] For example, after CCRG protein has been expressed in a cell, itcan be isolated using any immuno-affinity chromatography. For instance,an anti-CCRG antibody (e.g., produced as described below) can beimmobilized on a column chromatography matrix, and the matrix can beused for immuno-affinity chromatography to purify CCRG protein from celllysates by standard methods (see, e.g., Ausubel et al., supra). Afterimmuno-affinity chromatography, CCRG protein can be further purified byother standard techniques, e.g., high performance liquid chromatography(see, e.g., Fisher, Laboratory Techniques In Biochemistry And MolecularBiology, Work and Burdon, eds., Elsevier, 1980). In another embodiment,CCRG protein is expressed as a fusion protein containing an affinity tag(e.g., GST) that facilitates its purification.

[0093] CCRG-Protein Specific Antibodies

[0094] CCRG proteins (or immunogenic fragments or analogs thereof) canbe used to raise antibodies useful in the invention. Such proteins canbe produced by recombinant techniques or synthesized as described above.In general, CCRG proteins can be coupled to a carrier protein, such asKLH, as described in Ausubel et al., supra, mixed with an adjuvant, andinjected into a host animal. Antibodies produced in that animal can thenbe purified by peptide antigen affinity chromatography. In particular,various host animals can be immunized by injection with a CCRG proteinor an antigenic fragment thereof. Commonly employed host animals includerabbits, mice, guinea pigs, and rats. Various adjuvants that can be usedto increase the immunological response depend on the host species andinclude Freund's adjuvant (complete and incomplete), mineral gels suchas aluminum hydroxide, surface active substances such as lysolecithin,pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpethemocyanin, and dinitrophenol. Other potentially useful adjuvantsinclude BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

[0095] Polyclonal antibodies are heterogeneous populations of antibodymolecules that are contained in the sera of the immunized animals.Antibodies within the invention therefore include polyclonal antibodiesand, in addition, monoclonal antibodies, single chain antibodies, Fabfragments, F(ab′)₂ fragments, and molecules produced using a Fabexpression library. Monoclonal antibodies, which are homogeneouspopulations of antibodies to a particular antigen, can be prepared usingthe CCRG proteins described above and standard hybridoma technology(see, for example, Kohler et al., Nature 256:495, 1975; Kohler et al.,Eur. J. Immunol. 6:511, 1976; Kohler et al., Eur. J. Immunol. 6:292,1976; Hammerling et al., “Monoclonal Antibodies and T Cell Hybridomas,”Elsevier, N.Y., 1981; Ausubel et al., supra). In particular, monoclonalantibodies can be obtained by any technique that provides for theproduction of antibody molecules by continuous cell lines in culturesuch as described in Kohler et al., Nature 256:495, 1975, and U.S. Pat.No. 4,376,110; the human B-cell hybridoma technique (Kosbor et al.,Immunology Today 4:72, 1983; Cole et al., Proc. Natl. Acad. Sci. USA80:2026, 1983), and the EBV-hybridoma technique (Cole et al.,“Monoclonal Antibodies and Cancer Therapy,” Alan R. Liss, Inc., pp.77-96, 1983). Such antibodies can be of any immunoglobulin classincluding IgG, IgM, IgE, IgA, IgD and any subclass thereof. A hybridomaproducing a mAb of the invention may be cultivated in vitro or in vivo.The ability to produce high titers of mAbs in vivo makes this aparticularly useful method of production.

[0096] Human or humanoid antibodies that specifically bind a CCRG protein can also be produced using known methods. For example, human antibodies against CCRG protein can be made by adapting known techniques for producing human antibodies in animals such as mice. See, e.g., Fishwild, D. M. et al., Nature Biotechnology 14 (1996): 845-851; Heijnen, I. et al., Journal of Clinical Investigation 97 (1996): 331-338; Lonberg, N. et al., Nature 368 (1994): 856-859; Morrison, S. L., Nature 368 (1994): 812-813; Neuberger, M., Nature Biotechnology 14 (1996): 826; and U.S. Pat. Nos. 5,545,806; 5,569,825; 5,877,397; 5,939,598; 6,075,181; 6,091,001; 6,114,598; and 6,130,314. Humanoid antibodies against CCRG can be made from non-human antibodies by adapting known methods such as those described in U.S. Pat. Nos. 5,530,101; 5,585,089; 5,693,761; and 5,693,762.

[0097] Once produced, polyclonal or monoclonal antibodies can be testedfor specific CCRG recognition by Western blot or immunoprecipitationanalysis by standard methods, for example, as described in Ausubel etal., supra. Antibodies that specifically recognize and bind to CCRG areuseful in the invention. For example, such antibodies can be used in animmunoassay to monitor the level of CCRG produced by a mammal (e.g., todetermine the amount or subcellular location of CCRG).

[0098] Preferably, CCRG protein selective antibodies of the inventionare produced using fragments of the CCRG protein that lie outside highlyconserved regions and appear likely to be antigenic, by criteria such ashigh frequency of charged residues. Cross-reactive anti-CCRG proteinantibodies are produced using a fragment of a CCRG protein that isconserved among members of this family of proteins. In one specificexample, such fragments are generated by standard techniques of PCR, andare then cloned into the pGEX expression vector (Ausubel et al., supra).Fusion proteins are expressed in E. coli and purified using aglutathione agarose affinity matrix as described in Ausubel, et al.,supra.

[0099] In some cases it may be desirable to minimize the potentialproblems of low affinity or specificity of antisera. In suchcircumstances, two or three fusions can be generated for each protein,and each fusion can be injected into at least two rabbits. Antisera canbe raised by injections in a series, preferably including at least threebooster injections. Antiserum is also checked for its ability toimmunoprecipitate recombinant CCRG proteins or control proteins, such asglucocorticoid receptor, CAT, or luciferase.

[0100] The antibodies of the invention can be used, for example, in the detection of CCRG protein in a biological sample. Antibodies also can be used in a screening assay to measure the effect of a candidate compound on expression or localization of CCRG protein. Additionally, such antibodies can be used to interfere with the interaction of CCRG protein and other molecules that bind CCRG protein.

[0101] Techniques described for the production of single chain antibodies (e.g., U.S. Pat. Nos. 4,946,778 and 4,704,692) can be adapted to produce single chain antibodies against a CCRG protein, or a fragment thereof. Single chain antibodies are formed by linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting in a single chain polypeptide.

[0102] Antibody fragments that recognize and bind to specific epitopescan be generated by known techniques. For example, such fragmentsinclude but are not limited to F(ab′)₂ fragments that can be produced bypepsin digestion of the antibody molecule, and Fab fragments that can begenerated by reducing the disulfide bridges of F(ab′)₂ fragments.Alternatively, Fab expression libraries can be constructed (Huse et al.,Science 246:1275, 1989) to allow rapid and easy identification ofmonoclonal Fab fragments with the desired specificity.

[0103] Proteins that Associate with CCRG

[0104] The invention also features methods for identifying polypeptides that can associate with a CCRG protein. Any method that is suitable for detecting protein-protein interactions can be employed to detect polypeptides that associate with a CCRG protein. Among the traditional methods that can be employed are co-immunoprecipitation, crosslinking, and co-purification through gradients or chromatographic columns of cell lysates or proteins obtained from cell lysates and the use of a CCRG protein to identify proteins in the lysate that interact with a CCRG protein. For these assays, the CCRG protein can be a full length CCRG protein, a particular domain of CCRG protein, or some other suitable CCRG protein. Once isolated, such an interacting protein can be identified and cloned and then used, in conjunction with standard techniques, to alter the activity of the CCRG protein with which it interacts. For example, at least a portion of the amino acid sequence of a protein that interacts with CCRG protein can be ascertained using techniques well known to those of skill in the art, such as via the Edman degradation technique. The amino acid sequence obtained can be used as a guide for the generation of oligonucleotide mixtures that can be used to screen for gene sequences encoding the interacting protein. Screening can be accomplished, for example, by standard hybridization or PCR techniques. Techniques for the generation of oligonucleotide mixtures and the screening are well-known (Ausubel et al., supra; and Innis et al., supra).

[0105] Additionally, methods can be employed that result directly in the identification of genes that encode proteins that interact with a CCRG protein. These methods include, for example, screening expression libraries, in a manner similar to the well known technique of antibody probing of λgt11 libraries, using a labeled CCRG protein or a CCRG fusion protein, for example, a CCRG protein or domain fused to a marker such as an enzyme, fluorescent dye, a luminescent protein, or to an Ig-Fc domain.

[0106] There are also methods available that can detect protein-protein interaction in vivo. For example, as described herein the two-hybrid system can be used to detect such interactions in vivo. See, e.g., Chien et al., Proc. Natl. Acad. Sci. USA 88:9578, 1991. Briefly, as one example of utilizing such a system, plasmids are constructed that encode two hybrid proteins: one plasmid includes a nucleotide sequence encoding the DNA-binding domain of a transcription activator protein fused to a nucleotide sequence encoding a native CCRG protein, a CCRG protein variant, or a CCRG fusion protein, and the other plasmid includes a nucleotide sequence encoding the transcription activator protein's activation domain fused to a cDNA encoding an unknown protein which has been recombined into this plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter gene (e.g., HIS3 or lacZ) whose regulatory region contains the transcription activator's binding site. Neither hybrid protein alone can activate transcription of the reporter gene: the DNA-binding domain hybrid cannot because it does not provide activation function, and the activation domain hybrid cannot because it cannot localize to the activator's binding sites. Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in expression of the reporter gene, which is detected by an assay for the reporter gene product.
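
The reporter logic of the two-hybrid system described above can be illustrated with a toy model. The following Python sketch is only an illustration of the reasoning, not a description of any laboratory procedure; the function name `reporter_on`, the prey labels, and the set of known interactions are all hypothetical.

```python
# Toy model of two-hybrid reporter logic (illustrative assumptions only).
# A DNA-binding-domain (DBD) fusion carries the bait; an activation-domain
# (AD) fusion carries a prey from the library. The reporter is transcribed
# only when both fusions are present AND bait and prey physically interact.

def reporter_on(dbd_fusion, ad_fusion, interactions):
    """Return True if the reporter gene would be expressed.

    dbd_fusion / ad_fusion: label of the protein fused to each domain,
    or None if that plasmid is absent from the cell.
    interactions: set of frozensets naming protein pairs that bind.
    """
    if dbd_fusion is None or ad_fusion is None:
        # Either hybrid alone cannot activate transcription.
        return False
    return frozenset((dbd_fusion, ad_fusion)) in interactions


# Hypothetical example: "preyA" binds the CCRG bait, "preyB" does not.
known = {frozenset(("CCRG", "preyA"))}

print(reporter_on("CCRG", None, known))     # DBD hybrid alone
print(reporter_on(None, "preyA", known))    # AD hybrid alone
print(reporter_on("CCRG", "preyA", known))  # interacting pair
print(reporter_on("CCRG", "preyB", known))  # non-interacting pair
```

Only the third call reports an active reporter, which mirrors the statement that interaction of the two hybrid proteins is required to reconstitute the activator.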

[0107] The two-hybrid system or related methodology can be used toscreen activation domain libraries for proteins that interact with the“bait” gene product. By way of example, and not by way of limitation, aCCRG protein may be used as the bait. Total genomic or cDNA sequencesare fused to the DNA encoding an activation domain. This library and aplasmid encoding a hybrid of a bait CCRG protein fused to theDNA-binding domain are co-transformed into a yeast reporter strain, andthe resulting transformants are screened for those that express thereporter gene. For example, a bait CCRG gene sequence, such as thatencoding CCRG protein or a domain of CCRG protein can be cloned into avector such that it is translationally fused to the DNA encoding theDNA-binding domain of the GAL4 protein. These colonies are purified andthe library plasmids responsible for reporter gene expression areisolated. DNA sequencing is then used to identify the proteins encodedby the library plasmids.

[0108] A cDNA library of the cell line from which proteins that interactwith a bait CCRG protein are to be detected can be made using methodsroutinely practiced in the art. According to the particular systemdescribed herein, for example, the cDNA fragments can be inserted into avector such that they are translationally fused to the transcriptionalactivation domain of GAL4. This library can be co-transformed along withthe CCRG-GAL4 encoding fusion plasmid into a yeast strain which containsa lacZ gene driven by a promoter which contains GAL4 activationsequence. A cDNA encoded protein, fused to GAL4 transcriptionalactivation domain, that interacts with bait CCRG protein willreconstitute an active GAL4 protein and thereby drive expression of theHIS3 gene. Colonies that express HIS3 can then be purified from thesestrains and used to produce and isolate bait CCRG protein-interactingproteins using techniques routinely practiced in the art.

[0109] Detection of CCRG Polynucleotides and Proteins

[0110] The invention encompasses methods for detecting the presence of a CCRG protein or a CCRG nucleic acid in a biological sample as well as methods for measuring the level of a CCRG protein or a CCRG nucleic acid in a biological sample. Such methods are useful for diagnosing cancer associated with CCRG expression (e.g., colon cancer).

[0111] An exemplary method for detecting the presence or absence of CCRGin a biological sample involves obtaining a biological sample from atest subject (e.g., a human patient), contacting the biological samplewith a compound or an agent capable of detecting a CCRG protein or anucleic acid encoding a CCRG protein (e.g., mRNA or genomic DNA), andanalyzing binding of the compound or agent to the sample after washing.Those samples having specifically bound compound or agent are those thatexpress a CCRG protein or a nucleic acid encoding a CCRG protein.

[0112] A preferred agent for detecting a nucleic acid encoding a CCRG protein is a labeled nucleic acid probe capable of hybridizing (e.g., under stringent hybridization conditions) to the nucleic acid encoding the CCRG protein. The nucleic acid probe can be, for example, all or a portion of the native CCRG gene itself (e.g., a nucleic acid molecule having the sequence of SEQ ID NO:6) or all or a portion of a complement of the native CCRG gene. Similarly, the probe can also be all or a portion of a CCRG gene variant, or all or a portion of a complement of a CCRG gene variant. For instance, oligonucleotides at least 15, 30, 50, 75, 100, 125, 150, 175, 200, 225, or 250 nucleotides in length that specifically hybridize under stringent conditions to the native CCRG gene or a complement of the native CCRG gene can be used as probes within the invention. An exemplary probe has the nucleotide sequence of SEQ ID NO:4. A preferred agent for detecting a CCRG protein is an antibody capable of binding to a CCRG protein, preferably an antibody with a detectable label. Such antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

[0113] Detection methods of the invention can be used to detect an mRNA encoding a CCRG protein, a genomic DNA encoding a CCRG protein, or a CCRG protein in a biological sample in vitro as well as in vivo. For example, in vitro techniques for detection of mRNAs encoding a CCRG protein include Northern hybridizations and in situ hybridizations. In vitro techniques for detection of a CCRG protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. In vitro techniques for detection of genomic DNA encoding a CCRG protein include Southern hybridizations. In vivo techniques for detection of a CCRG protein include introducing a labeled anti-CCRG antibody into a biological sample or test subject. For example, the antibody can be labeled with a radioactive marker whose presence and location in a biological sample or test subject can be detected by standard imaging techniques.

[0114] Screening for Compounds that Interact with CCRG Protein

[0115] The invention also encompasses methods for identifying compounds that specifically bind to a CCRG protein. One such method involves the steps of providing immobilized purified CCRG protein and at least one test compound; contacting the immobilized protein with the test compound; washing away substances not bound to the immobilized protein; and detecting whether or not the test compound is bound to the immobilized protein. Those compounds remaining bound to the immobilized protein are those that specifically interact with the CCRG protein.

EXAMPLES

[0116] The present invention is further illustrated by the following specific examples. The examples are provided for illustration only and are not to be construed as limiting the scope or content of the invention in any way.

Example 1 Identification of UniGene Preferentially Expressed in Colon Tumors

[0117] UniGene Hs.105470 was identified as being present in the colon tumor tissues, but not in the normal tissue. Total RNA was isolated from a matched set of normal and colon tumors and reverse transcribed using random hexamers and Superscript reverse transcriptase (Life Technologies). One-fortieth of the resulting cDNAs was PCR-amplified using the PCR primers described herein as SEQ ID NOs: 2 and 3. The conditions for the PCR included 1) initial denaturation at 94° C. for 7 mins; 2) denaturation at 94° C. for 1 min, annealing at 62° C. for 2 mins, and extension at 72° C. for 3 mins, for 35 cycles, with a final extension at 72° C. for 10 mins. Referring to FIG. 1, UniGene Hs.105470 showed an RT-dependent PCR product of 455 bp. This product was not seen in the control RT-minus reaction, nor in the peripheral blood lymphocyte DNA. A product of higher molecular weight was detected in the genomic DNA sample, indicating that the RT-PCR primers reside in two different exons. UniGene Hs.105470 has five ESTs assigned to the cluster. The sequence of the longest EST (Genbank Accession No. AA524300) in this UniGene is 577 bp in length (which was the maximum size extendable as a contig) and is shown herein as SEQ ID NO:1. The RT-PCR primers used to identify a gene encompassing this EST, termed CCRG, are shown in SEQ ID NO:2 (sense) and SEQ ID NO:3 (antisense).
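
The PCR conditions given above can be written down as a small data structure, which makes the total programmed hold time easy to check. The following Python sketch is illustrative only: the names `Step`, `CYCLE`, and `total_hold_minutes` are hypothetical, and the ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int    # hold temperature in degrees Celsius
    minutes: int   # hold time in minutes


# Values taken from the PCR conditions stated in Example 1.
INITIAL = Step("initial denaturation", 94, 7)
CYCLE = (
    Step("denaturation", 94, 1),
    Step("annealing", 62, 2),
    Step("extension", 72, 3),
)
N_CYCLES = 35
FINAL = Step("final extension", 72, 10)


def total_hold_minutes():
    """Sum of programmed hold times; ramping between steps is not counted."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL.minutes + N_CYCLES * per_cycle + FINAL.minutes


print(total_hold_minutes())
```

Under these assumptions the programmed holds sum to 7 + 35 × 6 + 10 = 227 minutes.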

Example 2 Cloning of the CCRG Gene

[0118] A mixture of cDNA libraries from different human tissues (activated T cells, adrenal gland, fetal brain, pituitary glands, spinal cord, small intestine, skeletal muscle, uterus, stomach and trachea) was screened using the oligonucleotide of SEQ ID NO:1 as a probe. Using Edge Sequence as a cloning system, several independent clones were obtained. Clones were verified to contain the CCRG gene by RT-PCR using SEQ ID NOs: 2 and 3 as PCR primers. A predominant 744 bp clone was sequenced. This clone contained the original sequence of SEQ ID NO:1 and 220 bp of additional sequences. A northern blot analysis of total RNA from colon tumor and normal tissue with this 744 bp probe detected an mRNA of approximately 0.8 kb in the tumor, but not in the normal tissue. The nucleotide sequence of this partial cDNA encoding a portion of the CCRG gene, termed C4, is shown as SEQ ID NO:5.

[0119] Unamplified human cDNA libraries from activated T cells, adrenal gland, fetal HUVEC, lymphoma, skeletal muscle, small intestine, stomach, Jurkat cells and uterus were screened using an RT-PCR generated product corresponding to SEQ ID NO:5. Edge (Edge Biosystems) unamplified cDNA libraries were prepared from stringently size-selected cDNA. cDNA insertion was performed utilizing a directional adaptor strategy which preserved representation. The vector used to prepare the library (pEAK8) contained the EBV latent origin of replication and an EBNA-1 transcription unit for plasmid replication in non-rodent cells and the SV40 origin for plasmid replication in cells expressing SV40 large T antigen. Insert expression was under the control of a modified version of the strong cell-type independent EF-1α promoter. The cDNA libraries from these organs were made in a mammalian expression vector, pEAK8, Edge Biosystems Inc. (Gaithersburg, Md.). The library was screened by high throughput screening with an internal PCR probe (SEQ ID NO:5). The pEAK8 vector is a mammalian expression vector containing SV40 and EBV origins of replication, an EF-1 alpha promoter, a poly linker site for cloning, and poly A and splice sites at the 3′ end of the insert. The vector also contains a Tk promoter driven puromycin gene for selection in mammalian cells, an EBNA-1 antigen tag and an ampicillin resistance gene for selection in bacteria. Forty-eight independent clones were isolated from this screening. All the clones were confirmed by PCR for the presence of internal sequence for the nucleic acid of SEQ ID NO:5. Plasmid DNAs were isolated from these 48 clones, PCR amplified with CCRG gene-specific primers, and restriction digested with EcoRI and NotI, which cut in the poly linker site of the pEAK8 vector, thus releasing the insert. The products were separated on 2% agarose gels and visualized by ethidium bromide staining. The products were confirmed to contain the nucleic acid of SEQ ID NO:5 by hybridization to an internal oligonucleotide probe.

[0120] Four independent clones strongly hybridizing to the probe were selected for sequencing. Sequencing was done using pEAK8 forward (5′ GGA TCT TTG GTT CAT TCT CAA 3′) and pEAK8 reverse (5′ CTG GAT GCA GGC TAC TCT AG 3′). Both of these primers lie outside the cloning sites in the poly linker region of the pEAK8 vector. All four clones contained sequences in addition to the nucleic acid of SEQ ID NO:5. One of the clones, termed G6, contained a complete open reading frame with a signal peptide sequence. The G6 clone had an insert size of approximately 800 bp and detected an mRNA of about 750 bp in a Northern blot of colon tumor-derived RNA, but not in the normal colon mRNA. RT-PCR primers encompassing the entire G6 clone also detected a specific product in the colon tumor derived mRNAs, but not in the corresponding normal colon derived mRNAs. The G6 clone also contained a polyadenylation site and a poly A tail. The gene thus identified was termed CCRG for Colon Carcinoma Related Gene.
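
As a quick sanity check on the two pEAK8 sequencing primers listed above, their length and base composition can be tallied in a few lines of Python. The sketch below also applies the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not part of the described method and gives only a rough melting-temperature estimate for short oligonucleotides; the function names are hypothetical.

```python
# Primer sequences as given in the text, with spaces removed.
PEAK8_FORWARD = "GGATCTTTGGTTCATTCTCAA"
PEAK8_REVERSE = "CTGGATGCAGGCTACTCTAG"


def gc_count(seq):
    """Number of G and C bases in an uppercase DNA sequence."""
    return sum(1 for base in seq if base in "GC")


def wallace_tm(seq):
    """Rough Tm estimate in degrees C: 2*(A+T) + 4*(G+C).

    This rule of thumb is an assumption added for illustration; it is
    not taken from the source and is only approximate.
    """
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


for name, seq in (("forward", PEAK8_FORWARD), ("reverse", PEAK8_REVERSE)):
    print(name, len(seq), gc_count(seq), wallace_tm(seq))
```

By this estimate the 21-mer forward primer (8 G/C) comes out at 58 °C and the 20-mer reverse primer (11 G/C) at 62 °C.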

Example 3 Characterization of CCRG Protein

[0121] The CCRG gene has a signal peptide sequence M G P S S C L L L I L I P L L Q L I N P G S T Q C S L D S V upstream of the initiation Met codon. This consensus signal peptide sequence is found in secreted growth factors and cytokines. Using the SignalP prediction program at the Swiss Expasy site (http://www.expasy.ch/), the precise position for the signal peptidase cleavage of the CCRG gene is predicted to occur at GST-QC, leaving a leader sequence of 7 amino acids before the Met codon of the mature CCRG protein. The PSORT program at the Expasy site, which predicts cell localization, predicted that the CCRG gene product is likely localized outside the cell. The mature protein has a theoretical MW of 8.62 kDa and pI of 8.05.
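
The arithmetic behind the stated 7-residue leader can be checked directly from the sequence given above. The following Python sketch simply splits that sequence at the predicted GST-QC position; it does not reimplement SignalP, and the function name `split_at_cleavage` is hypothetical.

```python
# N-terminal sequence as listed in the text, one-letter code, spaces removed.
N_TERMINAL = "MGPSSCLLLILIPLLQLINPGSTQCSLDSV"


def split_at_cleavage(seq, left="GST", right="QC"):
    """Split seq between the residues `left` and `right`.

    The cleavage position (GST-QC) is the prediction reported in the
    text; this function only locates that motif and cuts after `left`.
    """
    site = seq.find(left + right)
    if site < 0:
        raise ValueError("cleavage motif not found")
    cut = site + len(left)
    return seq[:cut], seq[cut:]


removed, leader = split_at_cleavage(N_TERMINAL)
print(removed, len(removed))
print(leader, len(leader))
```

The portion after the cut is QCSLDSV, which is 7 residues long and so agrees with the leader length stated in the text.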

[0122] A CCRG gene, which was cloned in a Simian Virus 40 (SV40) expression vector, was transfected into recipient cells (e.g., COS-7 cells). The transfection resulted in expression of a CCRG protein in the supernatant of the media in which the transfected cells were cultured. When tested on colon carcinoma derived cells, the cell free supernatant stimulated DNA synthesis in the cells as monitored by ³H-thymidine incorporation. These results are consistent with the CCRG gene encoding a secreted product which has a growth stimulating property.

[0123] Nucleotide and amino acid homology searches at the NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) revealed no significant homology to known proteins. Analysis of motifs and patterns at the PROSITE database (http://www.expasy.ch/tools/scnpsite.html) of the Expasy site showed that the CCRG gene product likely contains phosphorylation sites, myristoylation sites, and glycosylation sites. In addition, a prokaryotic lipoprotein binding site and a prenylation site were identified as being encoded by the CCRG gene.

[0124] The C-terminus of the CCRG gene is cysteine rich with a motif 1CX11; 2CX8; 3CX1; 4CX3; 5CX10; 6CX1; 7CX1; 8CX9; 9C10C. This motif is also found in ultra high sulphur matrix protein in hair keratin, metallothionein and cation transporters. Three dimensional structure homology searches against the 3D database of PDB at the NCBI showed some structural homology to cartilage oligomeric matrix precursor and LDL-r related proteins. A secondary structure prediction program at the Expasy site predicted mostly a mixture of alpha helices, beta strands, and coils.

Example 4 Lack of CCRG Gene Expression in Non Colon-Derived Solid Tumors

[0125] In order to evaluate the specificity of expression of CCRG (C4) gene in colon tissues, a panel of cDNAs from diverse normal human tissues was obtained from Clontech Laboratories (Palo Alto, Calif.). These cDNAs were PCR amplified using the sense and the antisense primers described respectively as SEQ ID NOs: 2 and 3. RT-PCR analysis of these cDNAs was performed as described herein. As shown in FIG. 2, the C4 (portion of the CCRG) gene was detected in small intestine, but not in heart, brain, placenta, liver, kidney, skeletal muscle, spleen, thymus, testis, peripheral blood lymphocytes, lymph nodes, bone marrow, fetal liver, tonsils, breast, colon, lung, ovary, pancreas and prostate. The samples were simultaneously analyzed for actin expression as an internal control.

[0126] To further evaluate the specificity of CCRG expression to colontumors, random primed cDNAs from five other solid tumors (breast, lung,ovary, prostate and pancreas) were generated using the RT methoddescribed herein. These cDNAs were PCR amplified using the sense and theantisense primers described as SEQ ID NOs: 2 and 3. As shown in FIG. 3,the amplified products were not detected in any of these tumor or normaltumor derived cDNAs. The samples were simultaneously analyzed for actinexpression as an internal control.

Example 5 Colon Tumor Specific Upregulation of the CCRG Gene

[0127] Further evidence that the CCRG gene expression is colon tumor specific was obtained using cDNAs derived from five different matched normal and tumor colon tissues. Random primed cDNAs were generated from the total RNAs from these tissues, and the cDNAs were PCR amplified using the sense and the antisense primers described in SEQ ID NOs: 2 and 3. As shown in FIG. 4, the CCRG (C4) gene was upregulated in each of the colon tumor tissues, but not in the matched normal tissues.

Example 6 Detection of the CCRG Gene by Hybridization Using an Oligonucleotide Probe

[0128] The CCRG gene was detected using an oligonucleotide probe labeled with ³²P-labeled dNTP. An oligonucleotide corresponding to SEQ ID NO:4 was synthesized, and then end-labeled with gamma ³²P-labeled dATP using polynucleotide kinase. RT-PCR products were generated in the presence or absence of RT from a matched set of tumor and normal colon, transferred to a nitrocellulose membrane, and hybridized to the ³²P-labeled oligonucleotide probe. As shown in FIG. 5, this probe hybridized to a 455 bp product in the tumor derived cDNA, but not in the normal tissue cDNA. The probe also detected a band in a genomic DNA (ca. 1.5 kbp) sample obtained from peripheral blood lymphocytes.

Example 7 Diagnostic Process

[0129] Evaluation of CCRG gene expression is specifically envisioned as a method for diagnosing cancer. In this method, tissue to be examined is isolated from a patient (e.g., cells from polyps, adenomas, carcinomas, etc. are obtained during routine colonoscopy). Total RNA obtained from these cells is then converted into cDNAs using either random primers or oligo dT to initiate the cDNA. The cDNAs obtained are PCR-amplified using the sense and the antisense primers described herein as SEQ ID NOs:2 and 3. The PCR-amplified products are then subjected to agarose gel electrophoresis, and the gel is stained to visualize the nucleic acid bands. The presence of a 455 bp product is indicative of potential cancer.
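
The read-out described above reduces to asking whether a band of the expected size appears in the RT-PCR lane. The following Python sketch shows one way such a call could be expressed; it is a hypothetical illustration, not a validated diagnostic procedure. The tolerance of 20 bp for gel sizing error, the RT-minus control check, and all function names are assumptions that do not come from the source.

```python
EXPECTED_BP = 455  # RT-PCR product size stated in the text


def has_expected_band(band_sizes_bp, expected=EXPECTED_BP, tolerance_bp=20):
    """True if any band lies within tolerance of the expected size.

    tolerance_bp is an assumed allowance for gel sizing error.
    """
    return any(abs(size - expected) <= tolerance_bp for size in band_sizes_bp)


def call_sample(rt_plus_bands, rt_minus_bands):
    """Classify one sample from its RT-plus and RT-minus lanes."""
    if has_expected_band(rt_minus_bands):
        # A product that appears without reverse transcriptase is not
        # RT-dependent, so the lane pair cannot be interpreted.
        return "uninterpretable"
    if has_expected_band(rt_plus_bands):
        return "CCRG product detected"
    return "CCRG product not detected"


print(call_sample([450], []))        # band near 455 bp, clean control
print(call_sample([1500], []))       # only a larger band is present
print(call_sample([455], [455]))     # band also in RT-minus control
```

A band near 1.5 kbp is deliberately not counted, since the text associates a larger product with genomic DNA rather than with the spliced transcript.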

[0130] In addition, a method for diagnosing colon cancer using blood or blood-derived materials (e.g., serum) and antibodies to the CCRG protein is also envisioned, as the CCRG protein is predicted to be a secreted protein. CCRG protein levels above the baseline due to the production of the CCRG protein by intestine cells would be indicative of colon cancer in the patients. The levels of secreted CCRG protein in the serum/plasma can be measured by methods described elsewhere herein including, e.g., Enzyme Linked Immunosorbent Assay (ELISA) or Western blotting.

Example 8 Detection of the CCRG Gene by Hybridization

[0131] Using hybridization techniques, CCRG gene expression can bedetected with the oligonucleotide probe described herein as SEQ ID NO:4.The oligonucleotide is labeled with radioactive or non-radioactivenucleotides, and the labeled probe is reacted with RNA from the samplebeing analyzed in the form of a Northern blot by transferring theproducts onto a filter (for example, nitrocellulose). This method canalso be performed in the form of Southern blot of RT-PCR reactionproducts made from the genomic DNA contained in a sample being analyzed.Following hybridization to the oligonucleotide probe, the filter iswashed, exposed to X-ray film, and auto-radiographed. Bands thathybridized to the probe can be identified from the autoradiogram. Theoligonucleotide probe can also be used for in situ hybridizationreactions to directly detect CCRG gene expression in tissues.

Example 9 Detection of Cancer Cells

[0132] A method for detecting cancer cells (e.g., metastatic cancer cells) is specifically envisioned. The method involves obtaining a tissue sample from a test subject (e.g., a cancer patient), optionally isolating nucleic acid (e.g., by PCR amplification) or protein from the sample, and probing the sample or isolated nucleic acid/protein with a molecule that specifically binds to CCRG genomic DNA, mRNA or cDNA, or the corresponding polypeptide product (e.g., CCRG protein). For example, in one variation of this method, total RNA is isolated from cancer cells obtained from fecal or peripheral blood samples. The RNA is then analyzed for the presence of CCRG mRNA by RT-PCR using the oligonucleotides of SEQ ID NOs:2 and 3 as primers. As another example, CCRG gene expression can be detected in the cells of these samples by in situ hybridization using SEQ ID NO:4 as an oligonucleotide probe. As still another example, antibodies specific for CCRG protein can be used to probe cell samples directly (e.g., using conventional immunofluorescence or histochemical staining techniques) or can be used to detect CCRG protein by immunoprecipitation and electrophoresis, or by Western blotting.

Example 10 CCRG as a Therapeutic Target

[0133] Inhibition of CCRG gene expression can be accomplished using an antisense nucleic acid. For example, a suitable length (e.g., 18-25 bases) of an antisense nucleic acid that specifically hybridizes to the 5′ coding region of the CCRG gene is synthesized, and then introduced into target tissues or cells (e.g., by electroporation, delivery via a vector, or liposomes). The target tissues or cells are then placed under conditions that allow the antisense nucleic acid to hybridize to the mRNAs transcribed from the CCRG gene. This hybridization prevents translation and thereby selectively inhibits expression of CCRG protein. See, e.g., Narayanan, R. In Vivo, 8: 787-794, 1994. As another example, the foregoing antisense nucleic acid can also be generated as a stable recombinant construct that can be delivered in vivo for gene therapy. See, e.g., Higgins et al., Proc Nat'l Acad Sci USA 90: 9901-9905, 1993.

[0134] In one variation of this example, the antisense nucleic acid is the oligonucleotide shown as SEQ ID NO:8 (i.e., 5′ TCC TTG ATC TTC TTA TCC ATA ACG 3′). This oligonucleotide can be substituted with various components at the nucleic acid backbone. Tumor-bearing patients can be treated with suitable formulations of this antisense oligonucleotide as described in, e.g., Narayanan R and Akhtar S., Curr Opin Oncol 8: 509-515, 1996; Higgins et al., Proc Nat'l Acad Sci USA 90: 9901-9905, 1993; and Narayanan R, J. Nat'l. Cancer Inst. 89: 107-109, 1997. The antisense oligonucleotide can be used alone or in combination with conventional chemotherapy or radiotherapy protocols.
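As an illustrative check (not part of the original disclosure), the relationship between the antisense oligonucleotide and its target can be sketched in Python. The snippet takes the reverse complement of SEQ ID NO:8 and looks for it in the stretch of SEQ ID NO:5 running from position 361 to position 420, copied from the sequence listing. The variable names and the decision to search only that 60-nucleotide stretch are assumptions made for the example.

```python
# Illustrative check that the antisense oligonucleotide is complementary to the
# listed cDNA. Only positions 361-420 of SEQ ID NO:5 are included here; that
# restriction is a simplification made for this example.
SEGMENT_START = 361
SEQ_ID_NO_5_SEGMENT = (
    "ggagtactca gtgttcctta gactccgtta tggataagaa gatcaaggat gttctcaaca"
).replace(" ", "")

SEQ_ID_NO_8 = "tccttgatcttcttatccataacg"  # antisense oligonucleotide, 24 nt


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    pairs = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(pairs[base] for base in reversed(seq.lower()))


# The sense-strand site that the antisense oligonucleotide would pair with.
target_site = reverse_complement(SEQ_ID_NO_8)
offset = SEQ_ID_NO_5_SEGMENT.find(target_site)

if offset < 0:
    print("no exact target site in this segment")
else:
    first = SEGMENT_START + offset
    last = first + len(target_site) - 1
    print(f"target site {target_site} at positions {first}-{last} of SEQ ID NO:5")
```

The sketch only confirms exact Watson-Crick complementarity against the listed sequence; it says nothing about hybridization efficiency, backbone modifications, or activity in cells.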

Example 11 CCRG as a Drug Discovery Target

[0135] A method of discovering drugs that selectively modulate CCRG protein function is envisioned. In this method, an expression vector incorporating a nucleic acid encoding a CCRG protein is introduced into and expressed in a host cell under conditions that cause the CCRG protein to be produced in the cell. The CCRG protein produced in this manner is then purified so that it can be used in an in vitro high-throughput assay to screen for compounds that bind to it. Those compounds that bind the CCRG protein can be isolated and further characterized. For example, such compounds could be tested for the ability to inhibit the growth of CCRG-expressing tumor-derived cell lines in growth inhibition assays.

[0136] As another method for discovering drugs, a substance to be screened can be added to a culture containing a cell expressing CCRG to see if the substance modulates CCRG expression. In an alternative method, cell lines transfected with recombinant constructs containing a reporter gene (e.g., those that encode chloramphenicol acetyltransferase, luciferase, beta-galactosidase, etc.) operably linked to the CCRG promoter can be used to identify substances that inhibit expression of the CCRG gene. For example, compounds that selectively inhibit expression of the reporter would be identified as CCRG-selective inhibitors.
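One way the readout of such a reporter screen might be scored is sketched below in Python. This is a hypothetical illustration, not a procedure taken from this disclosure: the use of a second reporter driven by an unrelated promoter as a selectivity control, the threshold values, the names `ReporterReading`, `percent_inhibition`, and `selective_inhibitors`, and all signal values are assumptions or placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class ReporterReading:
    compound: str
    ccrg_signal: float     # reporter driven by the CCRG promoter
    control_signal: float  # reporter driven by an unrelated promoter (assumed control)


def percent_inhibition(treated, vehicle):
    """Percent drop in reporter signal relative to the vehicle-only well."""
    if vehicle <= 0:
        raise ValueError("vehicle signal must be positive")
    return 100.0 * (1.0 - treated / vehicle)


def selective_inhibitors(readings, vehicle_ccrg, vehicle_control,
                         min_ccrg=50.0, max_control=20.0):
    """Return compounds that inhibit the CCRG reporter by at least min_ccrg
    percent while inhibiting the control reporter by at most max_control
    percent. Both thresholds are arbitrary placeholders."""
    hits = []
    for reading in readings:
        ccrg = percent_inhibition(reading.ccrg_signal, vehicle_ccrg)
        control = percent_inhibition(reading.control_signal, vehicle_control)
        if ccrg >= min_ccrg and control <= max_control:
            hits.append(reading.compound)
    return hits


# Placeholder values for illustration only; these are not experimental data.
readings = [
    ReporterReading("compound_A", ccrg_signal=200.0, control_signal=950.0),
    ReporterReading("compound_B", ccrg_signal=150.0, control_signal=300.0),
    ReporterReading("compound_C", ccrg_signal=900.0, control_signal=1000.0),
]
hits = selective_inhibitors(readings, vehicle_ccrg=1000.0, vehicle_control=1000.0)
print(hits)
```

Requiring a low effect on the control reporter is what separates a selective inhibitor from a compound that suppresses reporter signal in general, for example through toxicity; the cutoffs would have to be set from real assay variability.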

[0137] As CCRG is selectively expressed in colon tumors, but not in a variety of other tumors, compounds can be screened for the ability to selectively inhibit growth of CCRG-expressing tumors. Compounds identified in this manner can be further evaluated for CCRG-specific inhibition using the CCRG promoter-reporter gene constructs described above.

Example 12 CCRG Receptor as a Drug Target

[0138] Since at least one form of CCRG protein is a secreted molecule, it is possible that there exists a cellular receptor to which CCRG protein binds. Such a receptor can be identified by those skilled in the art, for example by labeling a CCRG protein with a detectable label (e.g., radioactive iodine), and then using the labeled CCRG protein to identify the receptor molecules present in colon cancer cell membranes. The identified receptor protein can be sequenced and, using such sequence, the full-length clone of such a receptor can be obtained. The cloned receptor can be used in screening assays to detect specific agonists/antagonists, and the lead drugs can be tested in colon cancer cells to determine whether they can inhibit the growth of colon cancers.

Example 13 Antibody Detection of CCRG

[0139] Tumor-selective expression of a CCRG gene product can be detected by measuring expression of CCRG protein using such techniques as immunohistochemistry or immunofluorescence. As an example of the former technique, paraffin-fixed sections of colon tumor and corresponding normal tissues are analyzed using antibodies specific for CCRG protein. Immunohistochemical detection of CCRG protein is performed using the techniques described in Scheurle et al., Anticancer Res. 20: 2091-2096, 2000. In brief, the sections are deparaffinized in a xylene bath two times for five minutes, and then rehydrated through graded alcohols to distilled water. Slides are incubated with the primary anti-CCRG antibody. Bound primary antibody is detected by staining the sections with an enzyme-labeled secondary antibody that specifically binds the primary antibody. The slides are developed using a chromogen compatible with the enzyme label. The sections are counterstained with hematoxylin, dehydrated in ethanol, and mounted in Permount (Fisher Scientific). Using this method, CCRG protein expression should be detectable in colon tumors but not in normal tissues. In view of the predicted secreted nature of the CCRG protein, use of anti-CCRG antibodies in Western blots or ELISAs is specifically envisioned in methods for detecting CCRG protein in tissue samples as a diagnostic or prognostic assay for CCRG-associated malignancies.

OTHER EMBODIMENTS

[0140] This description has been by way of example of how the compositions and methods of the invention can be made and carried out. Those of ordinary skill in the art will recognize that various details may be modified in arriving at other detailed embodiments, and that many of these embodiments will come within the scope of the invention.

[0141] Therefore, to apprise the public of the scope of the invention and the embodiments covered by the invention, the following claims are made.

1 13

1 576 DNA homo sapiens 1
tgaggtacaa agtttgtctt tattacccaa gaatcaggaa tggaacaaat gaagtgggac 60
gtttgagtta gatttcttgg ttgggaccct ggtttcatta ctgtcatggt cacaaactga 120
gttctcagcc tcctccctgt caggtcaggt ggcagcagcg ggcagtggtc cagtccacca 180
cactgcactg gcagtggcag gtggtttcca gctgaacatc ccacgaacca cagccatagc 240
cacaagcaca gccagtgaca gccatcccag cagggcagtg aggacggtct gccttggctt 300
ttgacactag cacacgagag cttcttgctt ataggagagg gactgtactc tagactgttg 360
agaacatcct tgatcttctt atccataacg gagtctaagg aacactgagt actccccggg 420
ttgatcagct ggagaagggg gattaggatg agaaggaggc aagaggacgg ccccatcctg 480
tacagagtca gtgtcctggg gctgggggaa agatggaaag agcttagatc tctgagccct 540
gggtggtggt gaggaaagaa gacacgtggc tcgtgc 576

2 22 DNA homo sapiens 2
gagttctcag cctcctccct gt 22

3 21 DNA homo sapiens 3
cgagccacgt gtcttctttc c 21

4 20 DNA homo sapiens 4
acaagcacag ccagtgacag 20

5 744 DNA homo sapiens 5
gcctcagaca gtggttcaaa gtttttttct tccatttcag gtgtcgtgaa aagcttgaat 60
tcggcgcgcc agatatcaca cgtgccaagg ggctggctca aataaatctg ttcttcagca 120
accctacctg cttctccaaa ctgcctaaag agatccagta ctgatgacgc tgttcttcca 180
tctttactcc ctggaaacta accacgttgt cttctttcct tcaccaccac ccaggagctc 240
agcgatctaa gctgctttcc atcttttctc ccagccccag gacactgact ctgtacagga 300
tggggccgtc ctcttgcctc cttctcatcc taatccccct tctccagctg atcaacccgg 360
ggagtactca gtgttcctta gactccgtta tggataagaa gatcaaggat gttctcaaca 420
gtctagagta cagtccctct cctataagca agaagctctc gtgtgctagt gtcaaaagcc 480
aaggcagacc gtcctcctgc cctgctggga tggctgtcac tggctgtgct tgtggctatg 540
gctgtggttc gtgggatgtt cagctggaaa ccacctgcca ctgccagtgc agtgtggtgg 600
actggaccac tgcccgctgc tgccacctga cctgacaggg aggaggctga gaactcagtt 660
ttgtgaccat gacagtaatg aaaccagggt cccaaccaag aaatctaact caaacgtccc 720
actttatttg ttvvattcat ttgt 744

6 887 DNA homo sapiens 6
gcctcagaca gtggttcaaa gtttttttct tcctttcagg tgtcgtgaaa agcttgaatt 60
cggcgcgcca gatatcacac gtgccaaggg gctggctcaa ataaatctgt tcttcagcaa 120
ccctacctgc ttctccaaaa ctgcctaaag agatccagta ctgatgacgc tgttcttcca 180
tctttactcc ctggaaacta accacgttgt cttctttcct tcaccaccac ccaggagctc 240
agagagatct aagctgcttt ccatcttttc tcccagcccc aggacactga ctctgtacag 300
gatggggccg tcctcttgcc tccttctcat cctaatcccc cttctccagc tgatcaaccc 360
ggggagtact cagtgttcct tagactccgt tatggataag aagatcaagg atgttctcaa 420
cagtctagag tacagtccct ctcctataag caagaagctc tcgtgtgcta gtgtcaaaag 480
ccaaggcaga ccgtcctcct gccctgctgg gatggctgtc actgctgtgc ttgtggctat 540
ggctgtggtt cgtgggatgt tcagctggaa accaccctgc cactgccagt gcagtgtggt 600
ggactggacc actgcccgac tgctgccacc tgacctgaca gggaggaggc tgagactcag 660
ttttgtgacc atgacagtaa tgaaaccagg gtcccaacca agaaatctaa ctcaaacgtc 720
cacttcattt gttccattcc tgattcttgg gtaataaaga caaactttgt acctctcaaa 780
aaaaaaaaaa aaaaagtatt tcattacctc tttctccgca cctggcctgc agccggccgc 840
aggtaagcca gcccaggcct cgccctccag ctaaggcggg acagggc 887

7 111 PRT homo sapiens 7
Met Gly Pro Ser Ser Cys Leu Leu Leu Ile Leu Ile Pro Leu Leu Gln
1               5                   10                  15
Leu Ile Asn Pro Gly Ser Thr Gln Cys Ser Leu Asp Ser Val Met Asp
            20                  25                  30
Lys Lys Ile Lys Asp Val Leu Asn Ser Leu Glu Tyr Ser Pro Ser Pro
        35                  40                  45
Ile Ser Lys Lys Leu Ser Cys Ala Ser Val Lys Ser Gln Gly Arg Pro
    50                  55                  60
Ser Ser Cys Pro Ala Gly Met Ala Val Thr Gly Cys Ala Cys Gly Tyr
65                  70                  75                  80
Gly Cys Gly Ser Trp Asp Val Gln Leu Glu Thr Thr Cys His Cys Gln
                85                  90                  95
Cys Ser Val Val Asp Trp Thr Thr Ala Arg Cys Cys His Leu Thr
            100                 105                 110

8 24 DNA ARTIFICIAL SEQUENCE Antisense oligonucleotide 8
tccttgatct tcttatccat aacg 24

9 22 DNA Homo sapiens 9
ggccgtcctc ttgcctcctt ct 22

10 24 DNA Homo sapiens 10
ggtttccagc tgaacatccc acga 24

11 21 DNA ARTIFICIAL SEQUENCE pEAK vector primer 11
ggatctttgg ttcattctca a 21

12 20 DNA ARTIFICIAL SEQUENCE pEAK vector primer 12
ctggatgcag gctactctag 20

13 30 PRT Homo sapiens 13
Met Gly Pro Ser Ser Cys Leu Leu Leu Ile Leu Ile Pro Leu Leu Gln
1               5                   10                  15
Leu Ile Asn Pro Gly Ser Thr Gln Cys Ser Leu Asp Ser Val
            20                  25                  30

What is claimed is:
1. A purified nucleic acid present at higher levels in colon cancer cells than in non-cancerous colon cells, said purified nucleic acid comprising a nucleotide sequence that encodes a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 or with a fragment of SEQ ID NO:7 at least 20 residues in length.
2. The nucleic acid of claim 1, wherein the nucleotide sequence defines a polynucleotide whose complement hybridizes under high stringency conditions to the nucleotide sequence of SEQ ID NO:6.
3. The nucleic acid of claim 1, wherein the polypeptide has an amino acid sequence consisting of SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length.
4. The nucleic acid of claim 1 comprising a fragment of the polynucleotide sequence of SEQ ID NO:6 at least 50 residues long.
5. The nucleic acid of claim 4 comprising the polynucleotide sequence of SEQ ID NO:6.
6. A vector comprising a purified nucleic acid present at higher levels in colon cancer cells than in non-cancerous colon cells, said purified nucleic acid comprising a nucleotide sequence that encodes a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 or with a fragment of SEQ ID NO:7 at least 20 residues in length.
7. The vector of claim 6, wherein said nucleic acid is operably linked to one or more expression control sequences.
8. A cell comprising a vector comprising a purified nucleic acid present at higher levels in colon cancer cells than in non-cancerous colon cells, said purified nucleic acid comprising a nucleotide sequence that encodes a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 or with a fragment of SEQ ID NO:7 at least 20 residues in length.
9. A probe comprising an oligonucleotide and a detectable label attached to the oligonucleotide, the oligonucleotide being at least 15 nucleotides in length and hybridizing under high stringency conditions to the nucleotide sequence of SEQ ID NO:7 or a complement of the nucleotide sequence of SEQ ID NO:7.
10. A kit for detecting a purified nucleic acid comprising a nucleotide sequence that encodes a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 or with a fragment of SEQ ID NO:7 at least 20 residues in length in a cell, the kit comprising: a first PCR primer comprising a first nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2 or SEQ ID NO:9, and a second PCR primer comprising a second nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:3 or SEQ ID NO:10.
11. A purified polypeptide expressed at higher levels by colon cancer cells than by non-cancerous colon cells, said purified polypeptide comprising an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length.
12. The purified polypeptide of claim 11 comprising a fragment of SEQ ID NO:7 at least 20 residues in length.
13. The purified polypeptide of claim 12 comprising residues 31-111 of the amino acid sequence of SEQ ID NO:7.
14. The purified polypeptide of claim 13 comprising the amino acid sequence of SEQ ID NO:7.
15. A purified antibody that specifically binds to a polypeptide comprising an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length.
16. The antibody of claim 15, further comprising a detectable label.
17. A method of producing a CCRG polypeptide comprising the steps of: (a) providing a cell transformed with a purified nucleic acid comprising a nucleotide sequence that encodes a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7; (b) culturing the cell under conditions that allow expression of the CCRG polypeptide; and (c) collecting the CCRG polypeptide from the cultured cell.
18. A screening method for identifying a substance that modulates expression of a gene encoding a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7, the method comprising the steps of: (a) providing a test cell that includes the gene encoding a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7; (b) contacting the test cell with a candidate substance; and (c) detecting an increase or decrease in the expression level of the gene encoding the CCRG polypeptide in the presence of the candidate substance, compared to the expression level of the gene encoding CCRG polypeptide in the absence of the candidate substance, as an indication that the candidate substance modulates the level of expression of the gene encoding the CCRG polypeptide.
19. A method for isolating a substance that binds a CCRG polypeptide sharing at least 80% sequence identity with SEQ ID NO:7 comprising the steps of: (a) providing a sample of the CCRG polypeptide immobilized on a substrate; (b) contacting a mixture containing the CCRG polypeptide-binding substance with the immobilized CCRG polypeptide; (c) separating unbound components of the mixture from bound components of the mixture; and (d) recovering the CCRG polypeptide-binding substance from the immobilized CCRG polypeptide.
20. A method for detecting the presence of a CCRG nucleic acid or polypeptide in a biological sample comprising the steps of: (a) providing the biological sample; and (b) detecting the presence of the CCRG nucleic acid or polypeptide in the biological sample.
21. The method of claim 20, wherein the step (b) of detecting the presence of the CCRG nucleic acid or polypeptide in a biological sample comprises: contacting the biological sample with a probe that binds to the CCRG nucleic acid or polypeptide; and detecting binding of the probe to the biological sample.
22. The method of claim 20, wherein the step (b) of detecting the presence of the CCRG nucleic acid or polypeptide in a biological sample comprises: isolating RNA from the biological sample; generating cDNAs from the isolated RNA; contacting said cDNAs with a first PCR primer that hybridizes to a first portion of a polynucleotide sharing at least 80% sequence identity with SEQ ID NO:6 or a complement of SEQ ID NO:6, and a second PCR primer that hybridizes to a second portion of a polynucleotide sharing at least 80% sequence identity with SEQ ID NO:6 or a complement of SEQ ID NO:6 to form a mixture; subjecting the mixture to reverse transcriptase-polymerase chain reaction to generate PCR amplification products; and analyzing said PCR amplification products by gel electrophoresis.
23. The method of claim 20, wherein the biological sample is a cell derived from a colon.
24. The method of claim 23, wherein said colon is a human colon.
25. The method of claim 20, wherein the biological sample is feces or urine.
26. The method of claim 20, wherein the biological sample is selected from the group consisting of blood, plasma, and serum.
27. A method for detecting the presence of a colon cancer cell in a biological sample, the method comprising the steps of: (a) providing the biological sample; and (b) analyzing the biological sample for the presence of a molecule selected from the group consisting of: a nucleic acid at least 15 nucleotides in length that hybridizes under stringent conditions to the nucleic acid of SEQ ID NO:6 or the complement of SEQ ID NO:6, and a polypeptide sharing at least 80% sequence identity with SEQ ID NO:7, wherein presence of the molecule in the biological sample indicates that the sample contains a colon cancer cell.
28. The method of claim 27, wherein the biological sample is a colon tissue sample.
29. The method of claim 27, wherein the biological sample is selected from the group consisting of: feces, urine, and peripheral blood.
30. A method for detecting the presence of a CCRG protein in a biological sample, the method comprising the steps of: (a) providing the biological sample; and (b) analyzing the biological sample for the presence of a polypeptide comprising an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length, wherein presence of the polypeptide in the biological sample indicates that the sample contains the CCRG protein.
31. The method of claim 30, wherein the biological sample is a colon tissue sample.
32. The method of claim 30, wherein the biological sample is selected from the group consisting of: feces, urine, and peripheral blood.
33. The method of claim 30, wherein the step (b) of analyzing the biological sample for the presence of a polypeptide comprising an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length comprises contacting the biological sample with an antibody that specifically binds to a polypeptide comprising an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length.
34. A method for identifying a cellular receptor for a CCRG protein, the method comprising the steps of: providing a cell membrane suspected of having a cellular receptor for a CCRG protein; contacting said cell with a polypeptide comprising an amino acid sequence that shares at least 80% sequence identity with SEQ ID NO:7 or a fragment of SEQ ID NO:7 at least 20 residues in length, whereby said polypeptide binds said cellular receptor to form a polypeptide-receptor complex; and isolating said complex.
35. The method of claim 34, further comprising the step of separating said cellular receptor from said polypeptide in said polypeptide-receptor complex.
36. The method of claim 35, further comprising the step of analyzing said receptor.
37. The method of claim 36, wherein said step of analyzing said receptor comprises determining the amino acid sequence of said receptor.
38. The method of claim 34, wherein said cell membrane is derived from a colon cancer cell.
39. A method for identifying a molecule that modulates the function of a cellular receptor for a CCRG protein, the method comprising the steps of: providing a cellular receptor for a CCRG protein; contacting said cellular receptor with a test molecule; and analyzing whether said contacting step results in modulation of a function of said cellular receptor.